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1 Introduction 


1.1 The problem 

The quotation from Dernmel opening this article, though possibly puzzling for those 
who day-to-day satisfactorily solve eigenvalue problems, summarizes a long-standing 
open problem in numerical linear algebra. The first algorithm that comes to mind 
for computing eigenvalues —to compute the characteristic polynomial xa of A and 
then compute (i.e., approximate) its zeros— has proved to be numerically unstable 
in practice. The so called Wilkinson’s polynomial, 

20 

w{x) := JJ(.x — i ) = x 20 + W 19 X 19 + • • • + w\x + wq 
i =1 

is often used to illustrate this fact. For a diagonal matrix D with diagonal entries 
1 , 2 ,..., 20 (and therefore with xd(x) = w(x )) an error of 2 -23 in the computation 
of W 19 = —210 produces, even if the rest of the computation is done error-free, 
catastrophic variations in the zeros of xd- For instance, the zeros at 18 and 19 
collide into a double zero close to 18.62, which will unfold into two complex conjugate 
zeros if the error in w\$ is further increased. And yet, there is nothing wrong in 
the nature of D (in numerical analysis terms, as we will see below, D is a well- 
conditioned matrix for the eigenvalue problem). The trouble appears to he in the 
method. 

Barred from using this immediate algorithm due to its numerical instability, 
researchers devoted efforts to come up with alternate methods which would appear 
to be stable. Among those proposed, the one that is today’s algorithm of choice is 
the iterated QR with shifts. This procedure behaves quite efficiently in general and 
yet, as Dernmel pointed out in 1997 [22, p. 139], 

after more than 30 years of dependable service, convergence failures of this 
algorithm have quite recently been observed, analyzed, and patched [...]. But 
there is still no global convergence proof, even though the current algorithm is 
considered quite reliable. 

Our initial quotation followed these words in Dennnel’s text. It asked for an algo¬ 
rithm which will be numerically stable and for which, convergence, and if possible 
small complexity bounds, can be established. Today, 17 years after Dennnel’s text, 
this demand retains all of its urgency: it is not known if any of the standard nu¬ 
merical linear algebra algorithms satisfies the properties above. For example: 

• The unshifted QR algorithm terminates with probability 1 but is probably 
infinite average cost if approximations to the eigenvectors are to be output 
(see [29]). 

• The QR algorithm with Rayleigh Quotient shift fails for open sets of real input 
matrices (see [ 6 , 7]). 
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• We do not know whether the Francis (double) shift algorithm converges gen¬ 
erally on real or complex matrices, nor an estimate of its average cost. 

• Other algorithms in modern texts are analyzed but don’t estimate the (nec¬ 
essarily infinite in the worst case) number of iterations, which usually relies 
on experimental results, see for example [36] which uses a divide and conquer 
algorithm or [23] which is on turn based on the algorithm in [5]. 

Algorithms which output approximate eigenvalues without accompanying ap¬ 
proximate eigenvectors might be easier to analyze. The experimental evidence of [37] 
for symmetric matrices suggests that many of the algorithms in use are of average 
finite cost and even that there is some universality. An informal explanation of this 
fact is that the eigenvalues of symmetric matrices are very well conditioned, see for 
example [52, eq. (1.5)]. But eigenvectors are another matter. When the matrices 
are close to having multiple eigenvalues the condition of the eigenvector tends to 
infinity. For example, even for 2x2 symmetric matrices, any pair of orthogonal 
vectors (a, b) and (—6, a) are the eigenvectors of a matrix 

fl + £l £3 \ 

V e 3 1 + £ 2 ) 

for |£j|, i = 1,2,3, arbitrarily small. 

The only goal of this paper is to give a positive answer to Demmel’s question. 
The set of our main results can be informally stated as follows. 

main results We exhibit algorithms which on input a complex matrix A with 
complex Gaussian entries generate (with probability 1) an “excellent” approximation 
to one of (or all) the (eigenvalue, eigenvector) pairs of A. Some of these algorithms 
are deterministic while some are randomized. Their running time (expected running 
time for the randomized case) is polynomial in n on average (w.r.t. A) as well as 
under a standard smoothed analysis. 

More precisely, the average complexity bounds we prove, for n x n matrices, 
are 0(n 7 ) —for the computation of a single eigenpair with either a deterministic or 
a randomized algorithm— and 0(n 9 ) —for the computation of all eigenpairs with 
a deterministic algorithm. Note that these are just upper bounds: the practical 
performance of the algorithms might be better. The precise statements of the main 
results are in Theorems 2.24, 2.27, and 2.34. 

1.2 A few words on approximations 

It must be said upfront that we do not think the algorithm we propose will out¬ 
perform, in general, iterated QR with shifts. It nonetheless possesses some worthy 
features which we want to describe in this introduction. The key one, we already 
mentioned, is that both convergence and complexity bounds can be established for 
it. It is also numerically stable. But in addition, it is strongly accurate. 
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A starting point to understand the meaning of this last claim, is the observation 
that there are two different obstructions to the exact computation of an eigenpair. 
Firstly, the use of finite precision, and the ensuing errors accumulating during the 
computational process. The expression numerically stable is usually vested on al¬ 
gorithms for which this accumulated error on the computed quantities is not much 
larger than that resulting from an error-free computation on an input datum which 
has been read (and approximated) with machine precision. Secondly, the nonlinear 
character of the equations defining eigenvalues and eigenvectors in terms of the given 
matrix. For n > 5, we learned from Abel and Galois, we cannot write down these 
eigenvalues in terms of the matrix’ entries, not even using radicals, and the same 
remains true for eigenvectors. Hence, we can only compute approximations of them 
and this is so even assuming infinite precision in the computation. 

The expression strongly accurate refers to the quality of these approximations. It 
is common to find in the literature (at least) three notions of approximation which 
we next briefly describe. To simplify, we illustrate with the computation of a value 
C € C from a datum A £ C N (and the reader may well suppose that this computation 
is done with infinite precision). We let C be the quantity actually computed and we 
consider the following three requirements on it: 

Backward approximation. The element ( is the solution of a datum A close to A. 
Given e > 0, we say that ( is an e-backward approximation when ||A —A|| < e 
(resp. ||A — A|| < e||A|| if we are interested in relative errors). 

Forward approximation. The quantity C is close to C- Given e > 0, we say that C 
is an e-forward approximation when |C — Cl < £ (resp. |C — Cl < £|C|)- 

Approximation a la Smale. An appropriate version of Newton’s iteration, starting 
at C, converges immediately, quadratically fast, to C> either in absolute or 
relative error. 

These requirements are increasingly demanding. For instance, if C is an e-backward 
approximation then the forward error |C — Cl is bounded, roughly, by econd(A). 
Here cond(A) is the condition number of A, a quantity usually greater than 1. So, 
in general, e-backward approximations are not e-forward approximations, and if A 
is poorly conditioned C nray be a poor forward approximation of C- We also observe 
that if C is an approximation a la Smale we can obtain an e-forward approximation 
from C by performing 0(log | loge|) Newton’s steps. Obtaining an approximation a 
la Smale from an e-forward approximation is a much less obvious process. 

When we say that our algorithm is strongly accurate, we refer to the fact that 
the returned eigenpairs are approximations a la Smale of true eigenpairs. 

In our case, we can not only efficiently compute e-forward approximations as 
above but also with respect to relative error. These are pairs (C, w) satisfying 

dp(w,v) < s and |C — A| < e|A| 
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for some true eigenpair (A, v) of A. Here dp(w, v ) denotes the angle between w 
and v (note that here the scaling of eigenvectors renders moot the relativization of 
the error). 

Theorem 1.1 We exhibit an algorithm that, given a matrix A £ C nxn , an approx¬ 
imate eigenpair (£, w) returned by any of the algorithms in the main results, and an 
accuracy e £ (0,1/2), produces an e-forward approximation (in relative error) of the 
approximated true eigenpair (X,v). The algorithm terminates provided A / 0. Its 
average cost over Gaussian matrices A (independently of the chosen approximate 
eigenpair) is 0(n 3 log 2 log 2 (n/e)). 

Combining our main results with Theorem 1.1 we can thus compute e- forward 
approximations (in relative error) of all eigenpairs of random Gaussian matrices with 
average running time 0(n 9 + n 3 log 2 log 2 (n/e)). See §11 for a proof of Theorem 1.1. 

1.3 A few words on complexity 

The cost, understood as the number of arithmetic operations performed, of com¬ 
puting an approximation of an eigenpair for a matrix A £ C nxn , depends on the 
matrix A itself. Actually, and this is a common feature in numerical analysis, it 
depends on the condition cond(yl) of the matrix A. But this condition is not known 
a priori. It was therefore advocated by Smale [47] to eliminate this dependency 
in complexity bounds by endowing data space with a probability distribution and 
estimating expected costs. This idea has its roots in early work of Goldstine and 
von Neumann [53]. 

In our case, data space is C nxn , and a common probability measure to endow it 
with is the standard Gaussian. Expectations of cost w.r.t. this measure yield expres¬ 
sions in n usually referred to as average cost. A number of considerations, including 
the suspicion that the use of the standard Gaussian could result in complexity 
bounds which are too optimistic compared with “real life”, prompted Spielman and 
Teng to introduce a different form of probabilistic analysis, called smoothed analysis. 
In this, one replaces the average analysis’ goal of showing that 

for a random A it is unlikely that the cost for A will be large 

by the following 

for all A, it is unlikely that a slight random perturbation A = A + A A 
will require a large cost. 

The expectations obtained for a smoothed analysis will now be functions of both 
the dimension n and some measure of dispersion for the random perturbations (e.g., 
a variance). 

Smoothed analysis was first used for the simplex method of linear program¬ 
ming [50]. Some survey expositions of its rationale are in [49, 51, 14]. One may 
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argue that it has been well accepted by the scientific community from the fact that 
Spielman and Teng were awarded the Godel 2008 and Fulkerson 2009 prizes for it 
(the former by the theoretical computer science community and the latter by the 
optimization community). Also, in 2010, Spielman was awarded the Nevanlinna 
prize, and smoothed analysis appears in the laudatio of his work. 

In this paper we will exhibit bounds for the cost of our algorithm both for average 
and smoothed analyses. 

1.4 A few words on numerical stability 

The algorithm we deal with in this paper belongs to the class of homotopy contin¬ 
uation methods. Experience has shown that algorithms in this class are very stable 
and stability analyses have been done for some of them, e.g. [13, 9, 20]. Because 
of this, we will assume infinite precision all along this paper and steer clear of any 
form of stability analysis. We nonetheless observe that such an analysis can be easily 
carried out following the steps in the papers mentioned above. 

1.5 Previous and related work 

Homotopy continuation methods go back, at least, to the work of Lahaye [30]. A 
detailed survey of their use to solve polynomial equations is in [31]. More explicit 
focus in eigenvalue computations is considered in [18, 32, 33, 34] but we do not know 
of any serious attempt to implement them. 

In the early 1990s Shub and Smale set up a program to understand the cost of 
solving square systems of complex polynomial equations using homotopy methods. 
In a collection of articles [41, 42, 43, 44, 45], known as the Bezout series , they put 
in place many of the notions and techniques that occur in this article. The Bezout 
series did not, however, conclusively settle the understanding of the cost above, and 
in 1998 Smale proposed it as the 17th in his list of problems for the mathematicians 
of the 21st century [48]. The problem is not yet considered fully solved by the 
community but significant advances appear in [10, 11, 15]. 

The results in these papers cannot be directly used for the eigenpair problem 
since instances of the latter are ill-posed as polynomial systems. But the intervening 
ideas can be reshaped to attempt a tailor-made analysis for the eigenpair problem. 
A major step in this direction was done by Armentano in his PhD thesis (see [3] 
and its precedent [21]), where the condition number jj, for the eigenpair problem was 
exhaustively studied. A further step was taken in [4] where n was used to analyze 
a randomized algorithm for the Hermitian eigenpair problem. 

Our paper follows this stream of research. 
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1.6 Structure of the exposition 

The remainder of this paper is divided into two parts. In the first one, §2 below, we 
introduce all the technical preliminaries, we describe with details the algorithms, and 
we state our main results (Theorems 2.24, 2.27, and 2.34). The condition number /j, 
Newton’s method, the notion of approximate eigenpair, and Gaussian distributions 
are among these technical preliminaries. The second part, which occupies us in the 
subsequent sections, is devoted to proofs. Some short proofs are included in §2 as 
well. 


2 Preliminaries, Basic Ideas, and Main Result 


2.1 Spaces and Metric Structures 

Let C nxn be the set of n x n complex matrices. We endow this complex linear space 
with the restriction of the real part of the Frobenius Hermitian product ( , )f and 
the associated Frobenius norm || • \\f given by 

n 

(A,B) f ■= trace {B*A) = ^ (Hjbij, ||d|| F = (A,A) 1/2 , 

i,j =i 


where A = () and B = (bij). The unit sphere will be denoted by §(C nxn ) 
or simply by §. We endow the product vector space C nxn x C with the canonical 
Hermitian inner product structure and its associated norm structure and (Euclidean) 
distance. 

The space C" is equipped with the canonical Hermitian inner product ( , ). We 
denote by P(C n ) its associated projective space. This is a smooth manifold which 
carries a natural Riemannian metric, namely, the real part of the Fubini-Study metric 
on P(C n ). The Fubini-Study metric is the Hermitian structure on P(C n ) given in 
the following way: for x € C n , 


(w,w') x 


( w , w') 


( 1 ) 


for all w, w' in the Hermitian complement x ± of x in C n . We denote by dp the 
associated Riemannian distance. An explicit formula for that distance (see for ex¬ 
ample [12, p. 226]) is 

dp(v, w) = arccos —-—-—-. (2) 

\\v\\ ■ ||w|| 

Note that this formula makes sense for v, w € C n (as the distance between the 
respective projective classes). 

The space C nxn x C x P(C n ) is endowed with the Riemannian product structure. 
The resulting distance equals 


dist((A,A,v),(A / ,A',v / )) 2 


||A - A'\\p + |A - A'| 2 + d P (v,v') 2 . 
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We will only use this distance on § x C x P(C n ). Note that for A, A' £ S, the distance 
dist((x4, A, v), (A',y,v')) is smaller than or equal to the natural geodesic (product) 
distance in § x C x P(C n ). For any nonzero matrix A £ C nxn ( no t necessarily of 
unit norm) we will write 

disU((A, v), (A',v')) 2 := ^ ^ + dp(v, v') 2 . 

Note that for any nonzero A £ C nxn . dist^ is a distance function in C x P(C n ) and 
if A £ §, then dist y i((A, v ), (A',v')) = dist((xl, A, v), (A, A', v')). 

2.2 The Varieties V, W, S' and E 

We define the solution variety for the eigenpair problem as 

V = V n := {(A,A,v) £ C nxn x C x P(C") : (A - Ald)u = 0} . 

Proposition 2.1 The solution variety V is a smooth submanifold of C nxn x C x 
P(C n ), of the same dimension as C nxn . 

Proof. See [3, Proposition 2.2]. □ 

The set V inherits the Riemannian structure of the ambient space. Associated 
to it there are natural projections: 



C nxn C x P(C n ). 


Because of Proposition 2.1, the derivative Diti at (A, A, v) is a linear operator be¬ 
tween spaces of equal dimension. The subvariety W of well-posed, triples is the 
subset of triples (A,A,v) £ V for which Dtti(A, A, v) is an isomorphism. In par¬ 
ticular, when (A, A, v ) £ VV, the projection tt\ has a branch of its inverse (locally 
defined) taking A € C nxn to (A, A, v) £ V. 

For v £ P(C") we denote by v 2 - = {x £ C" | ( x , v ) = 0} the tangent space to 
P(C n ) at v. Let P v ± : C" —>• r -1 be the orthogonal projection. Given (A, A, v ) £ 
C nxn x C x P(C n ), we let A\ v : v 2 - —>• v 2 - be the linear operator given by 

A x ,v : =P,xo(A-Ald)U. (4) 

If we choose a representative such that ||w|| = 1 and we assume that A\ )V is invertible, 
then we have 


ic n A\, v P v ±. 


(Id — vv*)(A — Aid) (Id — vv*) 


(5) 



and 


= (( ld “ VV *)( A ~ Aid)(Id - to*)) 1 ", (6) 

where io : v 1 ' —>• C n is the inclusion and 1 denotes Moore-Penrose pseudoinverse. 
The set of well-posed triples is exactly 

W = {(A,A,u) G V : A\ tV is invertible}, (7) 

see [3, Lemma 2.7]). We define £' := V \ W to be the variety of ill-posed, triples , 
and £ = vri(S') C C nxn the discriminant variety , i.e., the subset of ill-posed inputs. 

Remark 2.2 From (7) it is clear that the subset £' is the set of triples (A, A, v) € V 
such that A is an eigenvalue of A of algebraic multiplicity at least 2. Hence 
£ is the set of matrices A € C nxn with multiple eigenvalues, and for A € 
C nxn \ £ ; the eigenvalues of A are pairwise different and 7 r] _ 1 (H) is the set of triples 
(A, Ai, ui),..., (A, X n ,v n ), where (A*, Uj), i = 1,..., n, are the eigenpairs of A. 

Proposition 2.3 The discriminant variety £ C C nxn is a complex algebraic hy¬ 
persurface. Consequently, for all n > 2, we have dim® £ = 2 n 2 — 2. 

Proof. See [16, Proposition 20.18]. □ 


2.3 Unitary invariance 

Let U n be the group of nxn unitary matrices. The group U n naturally acts on 
P(C n ) by U ■ [u>] := [Uw\. In addition, U n acts on C nxn by conjugation (i.e., 
U ■ A := UAU- 1 ), and on C nxn x C by U ■ (A, A) := (L/xlt/- 1 , A). These actions 
define an action on the product space C nxn x C x P(C n ), namely, 

U- (A,X,v) := (UAU~ l ,\,Uv). (8) 

Remark 2.4 The varieties V, W, £', and £, are invariant under the action of lA n 
(see [3] for details). 

2.4 Condition of a triple 

In a nutshell, condition numbers measure the worst possible output error resulting 
from a small perturbation on the input data. More formally, a condition number is 
the operator norm of the derivative of a solution map such as the branches of 7r _1 
mentioned in §2.1 above, (see [16, §14.1.2] for a general exposition). 

In the case of the eigenpair problem, one can define two condition numbers for 
eigenvalue and eigenvector, respectively, and formulas for both of them have been 
known at least since [52], Armentano has shown that one can merge the two in 
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a single one (see §3 in [3] for details). Deviating slightly from [3], we define the 
condition number of ( A , A, v) £ C nxn x C x C n as 

p(A,\,v) :=\\A\\ F \\A^ V \\, (9) 

(or oo if A\ v is not invertible) where || • || is the operator norm. This coincides with 
p v (A,\,v) in [3]. Note that from (6), if {A, \,v) is such that p(A,X,v) < oo, and 
Hull = 1 then: 


K A , A,u) 


UWf ((Id 


vv*)(A — Aid) (Id 


vv*))^ 


( 10 ) 


Remark 2.5 The condition number p is invariant under the action of the unitary 
group U n , i.e., p(UAU ~ l , X,Uv) = p(A,X,v) for all U £ U. n . and scale invariant on 
the first two components, i.e., p(sA, s A, v) = p(A, A, v) for ah s € C \ {0}. 

Lemma 2.6 (Lemma 3.8 in [3]) For (A, X,v) £ V we have p(A,X,v) > □ 

The essence of condition numbers is that they measure how much may outputs 
vary when inputs are slightly perturbed. The following result, which we will prove 
in §3, quantifies this property for /a. 


Proposition 2.7 Let T : [0,1] —> V, T(t) = (A t , A t,v t ) be a smooth curve such that 
At lies in the unit sphere of C nxn , for all t. Write pt '■= /i(T(f)). Then we have, for 
all t £ [0,1], 

I'M < \J 1 +Ft Pt||, ||h t || < p t \\A t \\. 

In particular, 

||r(t)|| <Vb p t \\A t \\. 

Condition numbers are generally associated to input data. In the case of a 
problem with many possible solutions (of which returning an eigenpair of a given 
matrix is a clear case) one can derive the condition of a data from a notion of 
condition for each of these solutions. A discussion of this issue is given in [16, §6.8]. 
For the purposes of this paper, we will be interested in two such derivations: the 
maximum condition number of A, 


Pmax(A) .— 


max p(A,X j ,v j ), 

l<j<n 


and the mean square condition number of A, 


Pav{A) 





1 

2 



e Minion 


1 

2 


Condition numbers themselves vary in a controlled manner. The following Lip- 
schitz property and its corollary make this statement precise. 
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Theorem 2.8 Let A, A' £ C nxn be such that ||^4||_f = H^IIf = 1, let v,v' £ C n be 
nonzero, and let A, A' € C. Suppose that 


p{A,X,v) d\st((A,X,v), (A',X',v')) < 


£ 

471 


for some £ £ (0,1). 


Then we have 


Y^T7 h( A ,X, y ) < p{A',X',v') < 


(1 + £)p(A,\,v). 


Corollary 2.9 Let A £ S, A 0 X, and let A' £ § be such that 


||d — A'\\f < 


50 p 2 max (Ay 


for some £ £ (0,1). 


Then, A' 0 £ and (1 + e) Vmax(d) < /tmax(d') < (1 + e)/hnax(d). 

We give the proofs of Theorem 2.8 and Corollary 2.9 in §4 below. 

We close this paragraph observing that restricted to the class of normal matrices, 
the condition number p admits the following elegant characterization. 


Lemma 2.10 (Lemma 3.12 in [3]) Let A £ C nxn \ S be normal, and let 
(Ai, ui),..., (A n , v n ) be its eigenpairs. Then 


p(A, Ai,ui) 


UWf 

min J= 2 ,..., n | A j - Xi 


□ 


2.5 Newton’s method and approximate eigenpairs 

For a nonzero matrix A £ C nxn , we define the Newton map associated to A, 

N a : C x (C n \ {0}) 4Cx(C n \ {0}), 


by 


N a 





where 


Q) = (BC4(A,„) 


and F a (X, v) = {A — Ald)u is the evaluation map. This is a rational map (it is only 
defined on an open subset of C x (C n \ {0})). It was introduced in [3] as a Newton¬ 
like operator associated to the evaluation map F A , and the following formulas were 
obtained for v and A (recall the definition of A\ tV from (4)): 


v = A x , v 1 P v ±Av, 


((A Id — A){v — v),v) 
(v,v) 


( 11 ) 
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The map IV 4 is defined for every (A, v) € C x (C n \ {0}) such that Ay v is invertible. 


N: 







( 12 ) 


where the superindex k means k iterations. See [3, Sec. 4] for more details. 

The notion of approximate solution as a point where Newton’s method con¬ 
verges to a true solution immediately and quadratically fast was introduced by 
Steve Srnale [46]. It allows to elegantly talk about polynomial time without depen¬ 
dencies on pre-established accuracies. In addition, these approximate solutions are 
“excellent approximations” (as mentioned in the statement of the main results) in a 
very strong sense: the distance to the exact solution dramatically decreases with a 
single iteration of Newton’s method. In the context of eigenpair computations this 
concept is settled as follows. 


Definition 2.11 Given (A,X,v) e W we say that (£,u>) G C x (C n \ {0}) is an 
approximate eigenpair of A with associated eigenpair (A, v) when for all k > I the 
/cth iterate N^((,w) of the Newton map at (£ ,w) is well defined and satisfies 

/ 1 \ 2*- 1 

d\st A {(NX((,w)),(\,v)) < {^-J disu((C,uO,(A,u)). 

The following result estimates, in terms of the condition of an eigenpair, the radius 
of a ball of approximate eigenpairs associated to it. For a complete proof see [3, 
Theorem 5]. 

Theorem 2.12 There is a universal constant Co > 1/5 with the following property. 
Let (A, A, v) G W with || A\\p = 1 and let (£, w) G C x (C n \ {0}). If 

dist^A,„),«,»)) < 

then ((,w) is an approximate eigenpair of A with associated eigenpair (A,u). 

It is a simple exercise to check that for any nonzero z£C, (£, w) is an approximate 
zero of A with associated zero (A, v) if and only if (z£, w) is an approximate zero of 
zA with associated zero (zA, v). So from the point of view of analyzing the effect of 
the Newton methods we may pick whatever scaling is convenient. For us it will be 
convenient to assume that ||H||j? = 1 which we will do in the following. 

Proof of Theorem 2.12. Note that [3, Theorem 5] is the same result with 
Co = 0.2881, but the definition of the condition number in [3] is slightly different 
from ours. More exactly, if we denote by k(A, A, v) the condition number defined in 
[3] then we have k(A,X,v) = max(l, p(A, A, v)). Theorem 2.12 is hence true with 
k in the place of p and Co = 0.2881. However, from Lemma 2.6 we know that 
p(A,X,v) > 2 -1 / 2 which readily implies k(A,X,v) < y/2p(A,X,v). Theorem 2.12 
now follows from the fact that 0.2881 > \/2/5. □ 
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Remark 2.13 We note that ((, w) can be computed from the matrix A and the 
pair (C,w) in 0(n 3 ) operations, since the cost of this computation is dominated by 
that of inverting a matrix (or simply solving a linear system). 


2.6 Gaussian Measures on C nxn 

Let a > 0. We say that the complex random variable Z = X + \[—VY has distribu¬ 
tion A/"c(0, a 2 ) when the real part X and the imaginary part Y are independent and 

identically distributed (i.i.d.) drawn from Af(0, i.e., they are Gaussian centered 

2 

random variables with variance 

If Z ~ A/c(0, a 2 ) then its density C 
measure is given by 


with respect to the Lebesgue 


A z ) : = 


1 


7T(7 


We will write v ~ A/"c»(0, a 2 ) to indicate that the vector v € C” is random with 
i.i.d. coordinates drawn from A/c(0, a 2 ). Also, we say that A G C mxn is (isotropic) 
Gaussian and we write A Mb mXn (0, (j 2 ), if its entries are i.i.d. Gaussian random 
variables. The resulting probability space is sometimes called the Ginibre ensemble. 

If A G C mxn and G ~ A f^mxn (0, a 2 ), we say that the random matrix A = G + A 
has the Gaussian distribution centered at A, and we write A ■Me mXn (i,u 2 ). The 
density of this distribution is given by 


A,cr 

V'Vtixn 


(A) := 


1 


IIA—A|If 
—772 - 


■Me mXn in the particular case where 


(7T(J 2 ) mn 

For conciseness, we will sometimes write A 
A = 0 and a = 1. 

Crucial in our development is the following result giving a bound on the average 
condition for Gaussian matrices arbitrarily centered. Its statement is similar to the 
main technical result in [16, Thm. 3.6]. We will prove it in §7. 

For technical reasons we will be interested in the following variation of //: 

Hf{A,\,v) := PI|f||A^||f 

(note, we only replaced ||A^].|| by ||A)(^||f) and the corresponding 


7 t F,av(A) :— | ^ ^ ^ fJ,p(A , A j , 


3 =1 


Theorem 2.14 For A £ C nxn and a > 0 we have 

,2 

/A 

E 




C nx ' 


M,° 2 ) 


(Ff, av(^)\ ^ n 

' Pill > - 
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Moreover, for A chosen with the uniform distribution (§) in the unit sphere § of 
C nxn we have: 


IE (hl.aM)) < 

S) 


Remark 2.15 (i) We note that no bound on the norm of A is required in the first 
claim of Theorem 2.14. Indeed, using pF,av(sA) = pF,av(A), it is easy to see 
that the assertion for a pair {A, o) implies the assertion for (sH, so), for any 
s > 0. 

(ii) It is remarkable that if we change Gaussian matrices to some classes of struc¬ 
tured matrices, the expected value of the condition number can be very high, 
see for example [8] and references therein. 


2.7 Truncated Gaussians and smoothed analysis 

For T, o > 0, we define the truncated Gaussian A/cnxn T (0, a 2 ) on C nxn to be the 
distribution given by the density 


Pt{A) 


Ft, a 

0 


if PIIf<t, 

otherwise, 


(13) 


where P T ,a := P ro b 4 x ,„(o,ct 2 ){||71||f < T}, and, we recall, tp^ n is the density of 
A/"oxn(0, <7 2 ). For the rest of this paper we fix the threshold T := \/2n. The fact 
that ||d|||, is chi-square distributed with 2 n 2 degrees of freedom, along with [17, 
Corollary 6] yields the following result. 


Lemma 2.16 We have Pr,a > \ for all 0 < o < 1. 


□ 


2 

The space C nxn of matrices with the Frobenius norm and the space C n with the 
canonical Hermitian product are isomorphic as Hermitian product spaces. Hence, 
the Gaussian J\f C nxn(0, o 2 ) on the former corresponds to the Gaussian J\f c „2 (0, a 2 ) 
on the latter, and we can deduce invariance of A/"c«x™(0, o 2 ) under the action of U n 2 
(in addition to that for conjugation under U n discussed in §2.3), and the same is true 
for the truncated Gaussian. In particular, the pushforward of both distributions for 
the projection C nxn \{0} —^ S, x4 1 — 5 - is the uniform distribution (§) (see [16, 

Chapter 2] for details) and we have 

E F(A) = E F{A) = E F(A), (14) 

-4~A/' c „xn(0,cr 2 ) A~A/' c7 ixn T (0,cr 2 ) A~^(S) 

for any measurable scale invariant function F : C rixn —>• [0,oo). 

Complexity analysis has traditionally been carried out either in the worst-case 
or in an average-case. More generally, for a function F : M m —>• M + (some measure 
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for the computational cost of solving an instance in M m ), the former amounts to the 
evaluation of sup agRm F(a) and the latter to that of E a ~s> F(a) for some probability 
distribution on M m . Usually, is taken to be the standard Gaussian in the 
input space. With the beginning of the century, Daniel Spielman and Shang-Hua 
Teng introduced a third form of analysis, smoothed analysis, which is meant to 
interpolate between worst-case and average-case. We won’t elaborate here on the 
virtues of smoothed analysis; a defense of these virtues can be found, e.g., in [49, 51] 
or in [16, §2.2.7]. We will instead limit ourselves to the description of what smoothed 
analysis is and which form it will take in this paper. 

The idea is to replace the two operators above (supremum and expectation) by 
a combination of the two, namely, 

sup E F(a) 

aSR m a~®(a,cr) 

where SHa, a) is a distribution “centered” at a having a as a measure of dispersion. 
A typical example is the Gaussian N(a, a 2 ). Another example, used for scale invari¬ 
ant functions F, is the uniform measure on a spherical cap centered at a and with 
angular radius a on the unit sphere §(M m ) (reference [16] exhibits smoothed anal¬ 
yses for both examples of distribution). In this paper we will perform a smoothed 
analysis with respect to a truncated Gaussian. More precisely, we will be interested 
in quantities of the form 


sup E F(A) 

A£C nxn A~Af c „ x „, T (A,cr 2 ) 

where F will be a measure of computational cost for the eigenpair problem. We note 
that, in addition to the usual dependence on n, this quantity depends also on a and 
tends to oo when a tends to 0. When F is scale invariant, as in the case of /r av or 
/imax, it is customary to restrict attention to matrices of norm 1. That is, to study 
the following quantity: 

sup E F(A). (15) 

AES A~A/' C nxn iT (h,Cr 2 ) 

2.8 The eigenpair continuation algorithm 

We are ready to describe the main algorithmic construct in this paper. When 
dealing with algorithms it will be more convenient to view the solution variety as 
the corresponding subset of C nxn x C x (C n \{0}), which, abusing notation, we still 
denote by V. 

Given two matrices B,Bq g S, B / ±Bq, let a := d§(Bo,B) € (0,7r) be the 
spherical distance (i.e. the angle) from Bq to B, and let 

£b 0 ,b = {B s : 0 < s < a} (16) 
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be the portion of the great circle in §, parametrized by arc-length, joining Bq and 
B, so B a = B. By abuse of notation, for any Aq, A G C nxn such that Aq, A are not 
M-linearly dependent, we simply write 

f Aq A \ 

d§(A 0 ,A ) := d§ ( , and Ca^.a '■= C a 0 a • 

The following neighborhood of the set of well-posed solution triples plays a distin¬ 
guished role in the continuation algorithm. Its definition uses a constant c* which 
we will later take to be 10~ 4 (cf. §5.1). 


Definition 2.17 The initial neighborhood of the set W is the set 
W := { {A, C, w) | 3(A, v ) s.t. (A, A, u) € W and dist j4 ((C, w ), (A, u)) <-- 777 }- 

The choice of c* implies that the hypothesis of Theorem 2.12 is satisfied and, 
hence, that ((,w) is an approximate eigenpair of A with associated eigenpair (X,v). 

Suppose that we are given an an initial triple (BqXq,wq) € W, Bq £ § and an 
input matrix B € § \ {±Bo}- Let Ao,wo G C x P(C n ) be as in Definition 2.17. As 
a consequence of the inverse function theorem, if (Cb 0 ,b \ {^ 0 }) H S = 0, then the 
map s t-A B s can be uniquely extended to a continuous map 


[0, a ]—* V, s >— {B s , X s , v s ), 


(17) 


We call this map the lifting of Cb 0 ,b with origin (Bq, Ao, ^o)- We can try to approx¬ 
imate the eigenpair (X a ,v a ) of B by following the lifting of Cb 0) b ■ To this end we 
can differentiate B s v s — X s v s = 0 w.r.t. s. This produces an Initial Value Problem 
(IVP) whose solution can be approximated by any standard numerical ODE solver. 
The main ingredient for the complexity estimate is the number of points in the 
discretization of [0, a] needed to approximate the solution of the IVP. 

Formalizing this idea to get an actual guarantee of convergence is a nontrivial 
task; only a non-constructive method has been described in [3] following the ideas 
in [40]. We now describe how to algorithmically construct a numerically stable 
method for this task: subdivide the interval [0, a] into subintervals with extremities 
at 0 = -so < «i < • • • < sk = a and successively compute approximations 
of (X Si ,v Si ), starting with ((o,wq) and then using Newton’s method. To ensure that 
these are good approximations, we actually want to ensure that for all i, (Q,Wi) is 
an approximate eigenpair of B Si+1 . Figure 1 attempts to convey the general idea. 
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Figure 1: The continuation of the solution path. 
The algorithmic counterpart of this idea is the following. 
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Algorithm 1 Path-follow 

Input: A, Ao £ C nxn \ {0}, and (Co> wo) € C x C n 
Preconditions: (A, Coi w o) € W, ||w’o|| = 1> MAo 7 ^ MA 

redefine A 0 := A 0 /||A 0 ||f, A x := A/\\A\\ f 
a := d§(Ao, Ai), so:=0, B:=Ao, i = 0 

repeat 

A := unit tangent vector (in the direction of the 
parametrization) to Ca 0 ,Ai at B 
As := Choose_step(B, A, (i, Wi) 

Sj_(_ 1 := min{a, Sj + As} 

B := point in Ca 0 ,Ai with d§(Ao,B) = s 

(Ci+i, Wi+i) := Wi) (three Newton iterations) 

if |Ci+i| > 1 then Ci+i = Ci+i/IC*+i|- 

i := i + 1 

until s = a 

return (C', w') = (||A|| F Ci, ™i) 

Output: ((>')€CxC n 

Postconditions: The execution halts if the lifting of Ca 0 ,a at (A, v ) 
doesn’t cut £'. In this case, (A,£',u/) £ VV and hence (C',u/) is an 
approximate eigenpair of A. 


Algorithm Path-follow is unambiguous except for the subroutine Choose_step, 
which will be described at the end of §5.2 below. We next state the main results for 
it. Recall, the point sp £ [0, a] is the value of s generated by Path-follow at the £t h 
iteration. 

Theorem 2.18 Suppose that Ca 0 ,a = {^s}se[o,a] (so Bq = Ao/||Ao||f and B a = 
A/||A||i?J and assume that £a 0 ,a H S = 0. Then the algorithm Path-follow stops 
after at most \K~\ steps where K is given by 

not. 

K := K(A,A 0 ,( 0 ,w 0 ) := C / p(B s , X s ,v s )\\(B s , \ s ,v s )\\ ds. 

Jo 

Here the pairs ( X s ,v s ) are given by (17). 

More generally, let q £ Z,q > 0. Then algorithm Path-follow stops after at most 
(the smallest integer greater than or equal to) 

not 

q + C p(B s ,X s ,v s )\\(B s ,X s ,v s )\\ds 

J S q 
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steps. The returned pair ((, w ) is an approximate eigenpair of A with associated 
eigenpair (X a ,v a ). Here, C < 3000 is a universal constant. 

For 0 < a < b < a, the quantity 


L n,a,b(.Bsi ^si^s) — / ^si Vs) || (B s , > Vs) II ds ( 18 ) 

J a 

in Theorem 2.18 is the length of the curve {(B s , X s ,v s ) : a < s < b} in the so- 
called condition metric. This is the metric that is obtained by pointwise multiplying 
the natural metric in § x C x P(C n ) by the condition number squared. We call 
L^ a ,b(B s ,X s ,v s ) the condition length of this curve. 

The proof of Theorem 2.18 is given in §5.3. 


Remark 2.19 From Proposition 2.7 , we have 

rot 

-^^.,5^,0;(-Bs? ^s? Vs) — j /i (B s , A s , Vs) ds. 

J Sq 

The following result gives an alternative bound to the number of steps. 


Corollary 2.20 Let Ao,A € C nxn be M-linearly independent and consider the 
path Af = (1 — t)Ao + tA which satisfies A\ = A. If £a 0 ,a H £ = 0 then, for any 
q € Z, q > 0, the algorithm Path-follow stops after at most \K~\ steps where 

K := q + V6 C\\Ao\\f\\Ai\\f f — -j ^ dt 

Jt q ll^tllF 


steps, with 


t , = _ I 1 A)IIf _ 

q \\A\ ||jr (sin ck cot s q — cos a ) + ||j4o||f 
Here, Xt, vt are the eigenvalue and eigenvector of A t defined by continuation. 


Proof. From Theorem 2.18 and Remark 2.19, letting C' = VbC, the number 
of steps is at most the smallest integer greater than or equal to 


q + C' f p 2 {B s ,X Bs ,v Ba )ds = q + C f p 2 (B s ,X Bs ,v Bs )\\B s \\ F ds, 

JSn JSn 


where X Ba and v Bs are the eigenvalue and eigenvector of B s defined by continuation. 
Now, reparametrizing that spherical segment by Ct = At/^\At ||_f> t € [0,1], the 
integral does not change. The quantity above is thus equal to 
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Substituting A t = A\ — Aq and A t = (1 — t)A$ + tA\ in this last formula and with 
some elementary computations (see [16, Lemma 17.5]) we conclude that 

d_ ( A t \ Pq||£ jjAijjg 

dt\\\A t \\ F ) ~ \\A t f F • 

The corollary follows. □ 

The inequality of Remark 2.19 implies that (up to constants) the upper bound for 
the number of steps by an algorithm in terms of the condition length as in Theorem 
2.18 is smaller than the upper bound in terms of the integral of the squared condition 
number as in Corollary 2.20. A similar situation applies in the context of polynomial 
system root finding. In this case implementations exist in both contexts see [45, 15, 
9, 20]. The condition length algorithm is more subtle and the proof of correctness 
more difficult, both for the polynomial system and eigenvalue, eigenvector cases. 
So the temptation is to present condition number squared algorithms. Here for the 
first time we present a quantitative estimate of the improvement which is significant 
and which justifies presenting the more complex condition length algorithm. In 
Theorem 2.28 a randomized algorithm is studied. The upper bound given by the 
condition length algorithm is 0(n 2 ) while an algorithgm with complexity given by 
the condition number squared would give 0 (n 3 ). 

In our main results we are interested in the cost of algorithms over random 
matrices A. The following quantity —the expected number of iterations of Path- 
follow for a given initial triple (Ao, Ao,uo)— becomes essential, 

Avg_Num_lter(A 0 , A 0 ,v 0 ) := E K(A,A 0 , A 0 ,u 0 ). 

A^ATf-nxn 

We can also consider the smoothed number of iterations of Path-follow that results 
by drawing instead the input matrix A from Mt{A,o 2 ) where A € 8 is arbitrary. 
We thus define 

Smd_Num_lter(A 0 , Ao, uo, a) := sup E K(A,A 0 , Ao,uo)- 

AeS A~J\f c nx„ T (A,a 2 ) 

Proposition 2.21 For Ao € C nxn we have 

Avg_NumJter(A 0 ,Ao,u 0 ) = O(n 4 /i 2 (A 0 , A 0 , v 0 )) 
and, for all o £ (0,1], 

Smd_NumJter(Ao,Ao,uo,u) = 0 ^V(A),A 0 ,u 0 ) y 
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Remark 2.22 Note that Proposition 2.21 ensures that Path-follow halts in finite 
time with probability 1 even if Aq E X, as long as h{Aq, Xo,vo) is finite. The 
algorithm does a first step that depends on h(Aq, Ao, vq) only and advances to B S1 
with si > 0. After that, the fact that the real codimension of X in C nxn is 2 (shown 
in Proposition 2.3) ensures that, almost surely, {Ca 0 ,a \ {A)}) H X = 0. Therefore, 
with probability 1, none of the matrices B s is in X and the integral over [s i, a] in 
Theorem 2.18 is finite. 

To compute all the eigenpairs from an initial matrix Aq and its eigenpairs 
(AW,vW),...,(AW,«W) we may proceed by following the n paths correspond¬ 
ing to taking these eigenpairs in the initial triples. In this case, we will be interested 
in the quantities 


n 

Avg_Num_lter_AII(A) : = Avg_Num Jter(A> X^\v^) 

i =1 


n 


= E 

A>^J\f^n X n 


J2k(A,A 0 , A«,t;«), 


and 

n 

Smd_Num_lter_AII(Ao, a) := sup E K(A, Aq v^). 

A£§A^J\f C nxn T (A,a 2 ) j =1 

For these quantities we prove the following result. 

Proposition 2.23 For Aq E C nxn we have 

Avg_NumJter_AII(A 0 ) = 0(™Vmax(A))) 


and, for all a E (0,1], 

Smd_Num_lter_AII(A 0 , a) = o(^-^ f^)- 

We prove Propositions 2.21 and 2.23 in Section 8. Note that the latter does not 
inmediately follow from the former since in general 

n 

^a 2 (A,a w ^ w )»aLx(A). 

i =1 

2.9 Initial triples and global algorithms 

The Path-follow routine assumes an initial triple (AAo,' u o) at hand. We next deal 
with this issue. We first consider the case of computing a single eigenpair. In this 
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case we consider the diagonal matrix H whose diagonal entries are (1,0,..., 0) and 
its well-posed eigenpair (l,ei). 


Algorithm 2 Single_Eigenpair 
Input: A € C nxn 


(C,w) := Path-follow(A, H, 1, ei) 

Output: (C^ w ) G C x C n 

Postconditions: The execution halts if the lifting of Ch,a at (l,ei) 
doesn’t cut £'. In this case, the returned (C ,w) is an approximate eigen¬ 
pair of A. 


We can formally state (and prove) the first of our main results. To this end, we 
define the average cost Avg_Cost(n) of Single_Eigenpair to be the average (over the 
input matrix A) of the number of arithmetic operations performed by the algorithm. 
We similarly define its smoothed cost Smd_Cost(n, a). 

Theorem 2.24 Algorithm Single_Eigenpair returns (almost surely) an approximate 
eigenpair of its input A € C nxn . Its average cost satisfies 

Avg_Cost (n) = 0(n'). 

For every 0 < <r < 1, its smoothed cost satisfies 

7 

Smd_Cost(n, a) = 

Proof. Lemma 2.10 and the fact that ||iL ||f = 1 imply that p(H, 1, ei) = 1. The 
statement is then a consequence of Proposition 2.21 and the fact that the average 
cost is obtained by multiplying Avg_NumJter(R, 1,ei) by the cost 0(n 3 ) of each 
iteration. □ 

Remark 2.25 The triple (H, 1, ei) is the version, in our context, of the initial pair 
proposed by Shub and Srnale in [45] for the computation of zeros of polynomial 
systems. In this later context, the problem of showing that one can efficiently follow 
linear homotopies with this initial pair remains open. 

The fact that any other eigenpair of H is ill-posed prevent us from using them 
to compute other eigenpairs of A. If we want to compute all the eigenpairs of A we 
will need to consider a different approach. 
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To do so, for any fixed n > 2, let 

D = diag(? 7 i,..., rj n ), 

where the ly are points in the unit side hexagonal lattice chosen in such an order 
that 0 = 1 771 1 < • • • < |? 7 n |. 


Lemma 2.26 We have y max {D) < n + o(n). 

Proof. The hexagonal lattice is the set of points of the form 


Q 


where Q = 


(J 


We first find a real number r > 0 with the property that the circle 0 ( 7 ’) of radius r 
contains at least n such lattice points. To do so, note that a lattice point Q(l) is in 
O(r) if and only if 

(f) G Q _ 1 B(r). 

Now, the singular values of Q _1 are \[2 and y / 2/3, so Q - 1 B(?’) is an ellipse of 
area 2nr 2 /\/3 and maximal radius y/2r. Dividing by the smallest integer N that is 
greater than 2 \/ 2 r and translating the resulting ellipse to have center ( 1 / 2 , 1 / 2 ), we 
look for points of the form ( a/N , b/N) with a, b £ {0,..., N — 1} which lie inside an 
ellipse of area 

2irr 2 

V3N 2 

contained in [0, l] 2 . This is a particular instance of the problem of counting lattice 
points in semialgebraic sets, a well studied problem for which a quite complete 
solution is for example [12, Th. 3, p. 327]. We conclude that the number of points 
in the hexagonal lattice in ID)(r) is at least 


27rr 2 

7T 


-2 N> 


2nr 2 

7T 


4\/2r - 2. 


In particular, we can find n lattice points in a circle of radius r = 3 1 / 4 /(27r) 1 ' /2 n 1 / 2 + 
c^n 1 / 2 ). Moreover, around each lattice point we can place a circle of radius 1/2 
without overlappings. It is a simple exercise to check that for any z € C and e > 0, 


< —j 
7 T£ z 


'\y-z\<E 


\y\ 2 dy. 


We thus have 


|o|I 2 f = £|%I 2 <£| 

j =1 * 


7T ^ 


j \y\ 2 dy<- \y\ 2 dy, 

\y-z\<l/2 71 J\y\<r+l/2 
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where z runs over all lattice points contained in B(r). Solving the last integral yields 


-D|||’ < 2r 4 + o(r 4 ) 


3n 2 , o. 

2 ? + o( " >• 


Finally, from Lemma 2.10 we conclude that 


M LxP) = l| J DHl<|^ + o(n 2 ). 


□ 

We now put together the continuation algorithm Path-follow and this specific 
initial triple. 


Algorithm 3 AILEigenpairs 

Input: A € C nxn 

compute D 

for j £ {1,. .., n} do 

(O’ w j) := Path-follow(A, D, rjj, ej) 

Output: {(Ci, wi), (C n,w n )} € (C x C n ) n 

Postconditions: The algorithm halts if Cd,a H S = 0. In this case, 
the pairs (C j,Wj) are approximate eigenpairs of A with pairwise different 
associated eigenpairs. 


We can now state (and prove) the second of our main results. 

Theorem 2.27 Algorithm AILEigenpairs returns (almost surely) n approximate 
eigenpairs of its input A £ C nxn , with pairwise different associate eigenpairs. Its 
average cost satisfies 

Avg_Cost (n) = 0(n 9 ). 

For every a < 1 its smoothed cost satisfies 

9 

Smd_Cost(n, a) = 

Proof. It easily follows from Lemma 2.26 and Proposition 2.23. □ 
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2.10 Randomized algorithms 

In this section we follow the ideas in [11] adapting them to the case of eigenvalue, 
eigenvector computations. Consider the probability distribution V in the solution 
variety V defined via the following procedure: 

randomly choose Aq Mcnxn 

randomly choose one eigenpair (Ao, vq) of do 

Next assume that we have a routine draw_from_D to draw triples (do,Ao,uo) from 
the distribution T> on V. Then we can consider the following algorithm. 


Algorithm 4 RandomJnitiaLtriple (scheme) 

Input: A € C nxn 

(do,Ao,uo) := draw_from_X> 

(C,w) := Path-follow(d, do, Ao, no) 

Output: ((,»)e(CxC n ) n 

Postconditions: The execution halts if the lifting of Ca 0 ,a at (Ao, vo) 
doesn’t cut S'. In this case, (C,w) is an approximate eigenpair of d. 


The interest of this algorithmic scheme is that we can prove better bounds (than 
those in Theorem 2.24) for the number of iterations of Path-follow. 

Theorem 2.28 The expected average number of homotopy steps of Path-follow in 
Algorithm 4 satisfies 

E E A(d,d 0 ,Ao,u 0 ) < 4Cn 2 , 

A^jV^nxn (Ao,Ao,^o)~® 

where C is as in Theorem 2.18. 

Theorem 2.28, which will be proved in Section 9, brings to focus the need of 
an implementation of draw_from_T>. It must be noted though that a direct imple¬ 
mentation is not possible since the second line in (19) (choosing (Ao, vq) at random) 
implicitly requires solving an EVP problem, the very question that this article is 
attempting to solve! This is a similar situation to that solved in [10, 11], where a 
random polynomial system and one of its zeros at random had to be chosen. It is 
also similar to that dealt with in [4] for the computation of eigenpairs of Hermitian 
matrices. A version of the proof of Theorem 2.28 with the Gaussian Unitary Ensem¬ 
ble replacing the Gaussian distribution, actually yields the following improvement 
over the main result in [4]. 
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Corollary 2.29 In the case of Hermitian matrices, the expected average number 
of homotopy steps of Path-follow in Algorithm 4 (with the randomization algorithm 
in [4] in place of draw_from_D) is 0(n 2 ). This yields an expected cost of 0(n 5 ) for 
the computation of one eigenpair and of 0(n 6 ) for the computation of all of them. 
Here the input matrix H is drawn from the Gaussian Unitary Ensemble. □ 

Following the ideas in those papers, we note that Theorem 2.28 would yield 
an algorithm with average running time 0(n 5 ) if we could find some collection of 
probability spaces fl n and functions ip n : Q n — >• V n , n > 2, such that: 

1. Choosing ui € Q n can be done starting with a number of random choices of 
numbers with the Ntr distribution, and performing some arithmetic operations 
on the results, the total expected running time being at most 0(n 5 ). 

2. Given u € Q n , V’n(w) is computable in average time 0{n 5 ), that is, the ex¬ 
pected number of arithmetic operations for computing ip n (uj) must be 0(n 5 ). 

3. Choosing u> at random in fl n and computing (Ho, Ao, vo) = ip n (w) is equivalent 
to choosing Aq ~ A/c«xn at random and choosing at random (Ao, vq) such that 
Avq = AoUo- That is, for any measurable mapping <j) : V n —>• [0, oo] we must 
have 



Unfortunately, we are not able to produce a collection of probability spaces Q n and 
functions ip n as described above. However, we will prove that relaxing (20) to the 
following less restrictive situation is actually possible: instead of demanding the 
equality in (20) we can just demand an inequality where the right-hand term is 
multiplied by some polynomial in n. Moreover, we do not need (20) to hold for 
every measurable function <f> since all the interesting functions for the EVP problem 
are projective functions, invariant under the action of the unitary group. We can 
thus relax (20) to hold only with a polynomially bounded inequality, and for unitary 
invariant projective functions. Proving that this can actually be done is our goal 
now. 

We start by defining Q n and ^ n . Consider the classical Stiefel manifold consisting 
of orthonormal (n — l)-frames in C n , given by 

<Sn-i(C n ) = {Q e C nx ("- 1 ) : Q*Q = I n _{}, 
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endowed with its natural probability measure given by the restriction of the Frobe- 
nius Hermitian structure to the tangent bundle. 

For every n > 2, let 

An ■■= {( M,Q ) : ker(M) = ker(Q*)} C C^ xn x S n _i(C n ). 

In other words, A n consists of pairs of matrices M, Q such that the columns of Q 
form an orthogonal basis of the complement of ker(Af). The set A n has a natural 
probability measure p_g n given by 


PA n (X) := E 

M ^r cl n-l)xr 


vol (Q : (. M,Q ) € An) jQ:(M,Q)£Ar 


l x (M,Q)dQ 


for measurable sets X C A n . 


Definition 2.30 Let 

tt n := {(A,u>, (M,Q)) : 2K(Atr(AfQ)) < 1 - |A| 2 (n - 1)} C C x C"" 1 x A n 

be endowed with the product measure nn n , normalized to have total unit mass (see 
Remark 2.31 below). Then, let 


= (() MQ + \I n _i) ' A ’ ei ) ' 

Remark 2.31 An explicit description of the product measure is as follows: 

MO n (Y ) := C n E : (A ,w,M,Q) € Y})) , 

A ,w 

for measurable sets Y C where A A/c, w A/^nxi and C n is a normalizing 

constant given by 

C- 1 = Prob (2»(Atr(MQ)) < 1 - |A| 2 (n - 1)) . (21) 

A, 1VI , 

Our last main result (see Section 10 for a proof) is the following. 

Theorem 2.32 Let f 1 n and yj n be as in Definition 2.30 for all n > 2. Then: 

1. Choosing uj € Q n can be done by choosing 2 n 2 — 2n + 1 numbers with the 
Nq distribution, checking a test which involves the computation of a Moore- 
Penrose inverse and computing two QR decompositions. This process may be 
repeated as a function of the test’s outcome, but the expectation of the number 
of times the test is performed is at most C n . The total expected running time 
is 0(n 3 C n ). 
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2. Given co € Q n , computing ip n (uj) can be done with running time 0(n 3 ). 

3. For any unitarily invariant measurable mapping <f> : V —>• [0, oo] we have: 


E (<t>(ip n (u))) < enC n E 

Ao~Af C nxn 


^ <KA)) ^0) W 0) 

\o,vq:Aovo=\ovo 


( 22 ) 


Note that (22) can be understood as follows: let m i be the push-forward measure 
of ij} n in V and let to 2 be the measure in V given by 

m 2 {X) = E ( —tt{(Ao,no) : Aouo = Aoto, (Ao, Ao,no) G A'} ) , 

Ao~A/’ c nxn \ n / 

for any measurable set X C V. Then, the Radon-Nikodim derivative dm\/dm 2 is 
bounded above by enC n . 

Problem 2.33 Describe an alternative collection (Q n ,tp n ) which satisfies a sharper 
version of (22), with a constant in the place of nC n . This would improve the running 
time of Algorithm RandomJnitiaLtriple below by a factor of 0(nC n ). 

We are now prepared to describe our random homotopy algorithm. 


Algorithm 5 RandomJnitiaLtriple 
Input: A G C nxn 

Randomly choose u> € 

(A 0 ,A 0 ,no) := ip n {u) (note that n 0 = ei) 

(C ,w) := Path-follow(A, Ao, Ao, no) 

Output: ((,w) € (C x C n ) n 

Postconditions: The execution halts if the lifting of £a 0 ,a at (Ao, vo) 
doesn’t cut S'. In this case, ((,w) is an approximate eigenpair of A. 


From Theorem 2.32, the expected running time of the computation of (Ao, Ao, no) 
is 0(n 3 C n ). Moreover, the expected number of homotopy steps in the execution of 
Path-follow(A, Aq, Ao,no) is 


5= E (K(A,i/> n (u)))= E 


A^Jf, 


E (K(A,ip n ( u>))) 
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where K(A, r ip n (uj)) is as in Theorem 2.18. From (22), we have 


S < 


enC n E 

Ao~J\f cn xn 


- E (K(A,A 0 ,Xo,vo)) 

\ \0,V0'-AqV 0=\0V0 ■N’c nxn 


< 0(n 3 C n ). 

Th. 2.28 

We multiply the number of steps by 0(n 3 ) to get the following complexity bound. 


Theorem 2.34 Algorithm RandomJnitiaLtriple returns (almost surely) an approx¬ 
imate eigenpair of its input A € C nxn . Its average cost satisfies 

Avg.Cost (n) = 0(n 6 C n ) < 0(n 7 ). □ 

Lemma 10.1 


3 Some properties of the condition number /i 

There is a general geometric framework for defining condition numbers, see [16, 
§14.3]. In our situation, this framework takes the following form. 

If (A, A, v) € W, then from the inverse function theorem the projection tt\ : V —>• 
C nxn (cf. (3)), around (A, X,v), has a local inverse (A, G*(A)), that is 

defined on an open neighborhood U of A in C nxn . We call G the solution map. The 
map G decomposes as G = (G\, G v ), where 

Gx'M^C and G v :U -4- P(C n ) 

associate to matrices B £ U an eigenvalue and the corresponding eigenvector. Let 

DG X (A) : C nxn -> C and DG V (A) : C nxn -A v L 

be the derivatives of these maps at A. The condition numbers for the eigenvalue A 
and the eigenvector v of A are defined as follows: 

H\(A,X,v) :=\\DGx(A)\\ and p v (A, \,v) := \\DG V (A)\\, 

where the norms are the operator norms with respect to the chosen norms (on C nxn 
we use the Frobenius norm and on v 3 - the norm given by (1)). 

The following result, Lemma 14.17 in [16], gives explicit descriptions of DG\ 
and DG V . Before stating it, we recall that if A is an eigenvalue of A there exists 
u € P(C n ) (the left eigenvector) such that (A — Ald)*u = 0. Recall the linear map 
A\ jV : v 3 - -» v 1 - introduced in (4). 

Lemma 3.1 Assume that Av = Xv and A has multiplicity 1. Then, the associated 
left eigenvector is 

u = v-icnA~* v P v± A*v. (23) 

Here we denoted A)(* v := (Aj)*)*. Note that (u,v) = ||u|| 2 . 
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Proof. Take a representative such that ||u|| = 1 and let 

? := icnA^* v P v ±A*v = ((Id — vv*)(A — Aid)*(Id — vv*))^A*v. 

’ (6) 

From the definition of the Moore-Penrose pseudoinverse, z is the unique element in 
v 1 - that minimizes ||(ld — vv*)(A — Aid)*;? — A*u||, that is we have (Id — vv*)(A — 
Aid)*?; = P v ±(A*v ) = (Id — vv*)A*v or equivalently 

(A — Aid)*? = A*v + tv for some t € C. 

Premultiplying both sides by v* we have 

v*(A - Aid)*? = v*A*v + f||u|| 2 => 0 = (Xv)*v + t||u|| 2 = (A + f)||u|| 2 , 

so we have t = — A and then 


(A - Ald)*(u -z) = {A- Ald)*u - A*v + Xv = 0, 


that is, v — z is a left eigenvector of A with associated (left) eigenvalue A as wanted. 

□ 

The following is Lemma 14.17 in [16]. 


Lemma 3.2 Let ( A , A, v) € VV and let u be a left eigenvector of A with eigenvalue A. 
Then: 

(a) We have (v , u) / 0. 

(b) The derivative of the solution map is given by DG(A)(A) = (X,v), where 


A 


(Av, u) 
(v,u) 


= ~K P v^ 


Av. 


□ 


The following result, which follows directly from Lemma 3.2, was already pointed 
out in [52] (see also [16, Prop. 14.15]). 


Proposition 3.3 Choosing the Frobenius norm on T 4 <C nxn = C nxn and jAr || • || on 
v^, the condition numbers p v for the eigenvector problem and p\ for the eigenvalue 
problem satisfy: 

= II0CaM)II = ]|j|[ (5) ]^[[ < VI + mHA, a, v) 

and 

MA X,v) = \\DG V (A)\\ = ||A^|| = □ 

P 1 F 
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Proof of Proposition 2.7. The first two inequalities are immediate from 
Proposition 3.3. For the third one, note that 


iif mii = 1100,011 < ioii\/i+f‘?+(i+o) < ii aii 

the last inequality since nt > (Lemma 2.6). □ 

Our last lemma is a version of Lemma 2.6 without the assumption that our point 
lies on W. 

Lemma 3.4 For A € C nxn , w € C n and ( £ C with |£| < ||^4||_f we have 

1 




1 

> - 

ii /1 _ ik»4|| 2 2 

+ w WPIf 


Proof. W.l.o.g we can assume that w = e\ and write 

where A £ C and a, b £ C n ~ 1 . Then, A^ w = 4 — £ld n _i and 

1 


M (A,C,ei) =||(j 4_ C | dn _ l) -i| | > 1 


> 


ha — cid„_iii mii + ici 


> 


F - \\ e l A \\ 2 + PIIf 


The lemma follows. 


□ 


4 Proofs of Theorem 2.8 and Corollary 2.9 

It will be handy to use the definition of /i given in (10). We start with a very simple 
linear algebra lemma about the Moore-Penrose pseudoinverse. 


Lemma 4.1 Let R, R' € C nxn be such that R has rank n — 1. Assume moreover 
that det(R') = 0 and 


R-R ’|| < 


IW 


Then, R! has rank n — 1 and 


for some 0 < e < 1. 


M 

l + £ 


< ii^ii < 


11^*11 
1 — £ 
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Proof. Let a and a' be the (n — l)th singular values of R and R' , respectively. 
Note that a — ||i? — R '|| < a' < a + ||R — i?'|| (this is a classical fact proved for the 
first time in [54], see also [26, Cor. 8.6.2]). In particular, a' > a — \\R — i?'|| = 
||i?t||-i _ \\R-R’\\ > 0, so R' has rank at least n— 1 and by hypothesis it has rank 
n — 1. Moreover, we have 

l|B , ll = - = — < l|i? V + " fi - g " < (1 +e)l|B' t l|. 

a a a a 

The upper bound follows from a similar argument. □ 

Proof of Theorem 2.8. Choose representatives such that ||n|| = ||n'|| = 1 
and let 


Q := (Id - vv*){A - Ald)(ld - vv*), Q' := (Id - v'v'*){A' - A'ld)(ld - 

We have rank(Q) = n — 1 since [i(A, X,v) < oo by our assumption, cf. 
claim that 


IIQ - Q'll < 


iiQ f ir 


which from Lemma 4.1 implies that Q' has rank n — 1 and 


v'v'*). 
(10). We 
(24) 


\m 

1 + s 


< IIQ /f ll < 


\m 

1 — e’ 


that is (recall ||^4 ||f = H^Hf = 1), 




1 + £ 


1 — £ 


as wanted. 

It remains to prove the claim. To do so, let (At, At, vt) be a geodesic in C nxn x C x 
P(C n ), parametrized by arc-length, joining (A, A, v) and (A!, A', v'), so by hypothesis 
we have t 6 [0, e/(4\/3||Qt||)] and we can choose representatives vt in such a way 
that 


vt±vt,, ||n t || = 1, 11At||| + |A t | 2 + ||nt|| 2 = 1, for all t G [0,£/(4\/3||Q t ||)]- 

Let Qt := (Id — vtvl)(A t — At Id) (Id — vtvl). In order to bound ||Qt|| we first note 
that for x € C n , 


(v t v* t +v t v* t )(x) || =\\v t (x,v t ) +v t (x,v t )\\ = V\\vt\\ 2 \(x,v t )\ 2 + ||nt|| 2 |(x,ht)| 2 

\x\\, 


= \\v t \\\\(x,v t )\' + 


(x -£*-) 

{ ' u*r 


< 
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that is || Vtv* +VtV *|| < \\vt\\- On the other hand, 

11-At — At Id || < ||A(|| + | At | < H-Atll + ||At|| < 2 ||t4^||f = 2, 
so computing the derivative of Qt we see that 

\\Qt\\ < 2|| v t v* + vtv*\\\\A t - At Id || + || A t - At Id || 

< 4||ht|| + \\A t - Atld|| < 4(11^11 + ||i*|| + |A t |) 

< 4\/3(||h t || 2 + ||i t || 2 + |At| 2 ) 1/2 = 4>/3. 

Thus, we have ||Q — Q'\\ < jW( 4 v / 3||Q t H) 4 ^/ 3 ^ < 4 y / 3 £/( 4 v / 3 ||Qt||) j finishing the 
proof of (24) and that of Theorem 2.8. □ 


Proof of Corollary 2.9. Consider the portion La,A' = {At} of great circle 
defined for t € [0, a] where a = d&(A, A'). Let (A, v) be any eigenpair of A. From the 
inverse function theorem, the map t eA T(t) = (A t ,X t ,vt) € W (given by the local 
inverse of 7Ti) is defined and, denoting dist(t) := dist((A, A, v), (A t , X t , vt)), satisfies 


dist(t) < 


e 

4\/3 n(A, A, v) ’ 


t G [0,(5) 


for some maximal <5 G (0,a]. From Theorem 2.8 this implies 


M r W) < (1 + £)»(A,\,v), t, G [0, (5), 


(25) 


(26) 


giving a global bound for the norm of the derivative of the solution map. Thus, T(t) 
can be continuously extended to [0, (5] and from maximality of 5, either <5 = a or 
5 < a and (25) must become an equality when changing t to 5. In this last case, 
which we will rule out by contradiction, note that from Proposition 2.7 and (26) 

dist(<5) < / ||r(t)||di < \/6 / dt < \/6(l + e)Sfi(A, A, v). 

Jo Jo 

Now, in our range of values we have a = d§{A,A!) = 2arcsin(||yl — A'\\p/2) < 
|[y[4||A — A'\\p which, together with 5 < a, implies 


'1001\\A-A'\\ F y/E(l + £)n(A,\,v)^100l£y/6(l + e) ^ £ 

' St( 1000 - 50000/x(yl, A, v) < Ay/3fi{A, A, v) ’ 

which is a contradiction (note that we have used the bound || A — A!\\p ^ 2 {A, A, v) < 
e/50). We thus conclude that 5 = a and the corollary follows. □ 
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5 Condition-length homotopy continuation 


The goal of this section is to fully describe the routine Choose_step (with which 
algorithm Path-follow will be complete) and to prove Theorem 2.18. 

To do so, it will be useful to denote by (3(A,(,w) the length (in the tangent 
space) of the Newton step with matrix A G § and input (C,w) £Cx IP(C n ). That 
is, if we take a representative such that ||u/|| = 1, 


P(A, (,w) 2 


(- df a((,w)\ C xw ± 


) 1 Fa((,w) 



|A| 2 + |H| 2 , 


where A ,v are given in (11). When ||u|| is small (5 approximates the length of the 
Newton step also under the distance dist. Indeed, 


dist A ((C ,uj),(N a ((,w ))) 2 = | A| 2 + dp(w, w — v ) 2 

A| 2 + (arccos ~ 

\ ||u;|| ||rc — u|| / 


( 2 ) 


= |A| 7 + ^arccos 


yr 


It is easy to check that 

_9_ 

10 


/ 1 X 2 

x 2 < ( arccos , ) < x 2 for x < A, 

- V yTTa?' ~ ~ 3 


so we have proved that whenever ||u|| < 1/3, 
9 


10 


/3{A, C, w) < dist A ((C, w), {N a ( C, w))) < /3(A, C, w). 


(27) 


(The upper bound inequality is valid regardless of the value of ||u||). The knowledge 
of /3 allows us to bound the distance from an approximate eigenpair to the associated 
exact eigenpair. 


Lemma 5.1 Assume that (Ci w ) JS an approximate eigenpair of A € § with as¬ 
sociated eigenpair (X,v). Then, dist^(((), w), (A, u)) < 2f3(A, (, w). Moreover, if 
(3(A,(,w) < 1/3 we also have 

C , w ) < disU((C, w), (A, v)) < 2 ft (A, (, w). 

Proof. We have 

dist A ((C, w), (A, v)) < dist A ((C, w), {N a ( C, w))) + dist : A ((N A ((, to)), (A, v)) 

< f3(A, C, w) + idist v4 ((C, w), (A, v)), 

(27) 2 
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the last from the definition of approximate eigenpair. The upper bound in the 
statement follows. Using a similar argument, 

dist A ((C, w), (A, v)) > dist A ((C, w), {N A (( , w))) - dist A ((IV A (C, w)), (A, v)) 

> C, w) - ^dist A ((C, w), (A, v)), 

(27) 10 1 

and the lower bound follows as well. □ 


5.1 The Lipschitz property of f3 


To describe Choose_step we need a set of constants satisfying a few relations. 
Not all of them are used in the description of Choose_step. Some only oc¬ 
cur in the proof of the correctness of Path-follow. We will consider constants 
c i> c i> c ui c u> c *j c 4 > c 5 > c 6 > c 7i K- These are any collection of positive numbers sat¬ 
isfying the following: 


Vs < ci < 
4\/3c* < 1, 


1 


< |c?(V3-l) 

v 3 c„ < c u - 


2 u ~ 1 - 3ci 

C 4 = c* + (1 + 4\/3c*)(ci + 2cu), 4 \/ 3 c 4 < 1, 


1 


2(1+4v / 3c*)(1+4v / 3c 4 )c u < Kc * < -, c 5 = -^— 

5 (1 + 4y3c*) 


-2 


2c* + (1 + 4v / 3 c*) 

(1 - 3ci) : 


C 6 = 


c 5 (l — 3ci) — 2(1 + 3ci)c* 
2(1 + 3ci)(1 + 4\/3c4) ’ 


C 7 = mm 


(1 + 4x/3c 4 )(1 + 4V3c*) 


> ce 


A collection of parameters satisfying these constraints is shown in Table 1. 


C 1 

10" a 

C 4 

0.005306... 

Cl 

V3c' 

C5 

0.00099... 

d 

U U 

10" 3 

C6 

0.00038... 

ClL 

,/5J | 3c i(%/3—1) 

VUC u + 2(1—3cl) 

K 

64 

c* 

10 -4 

C7 

0.00038... 


Table 1: Our choice of constants for Path-follow 


In addition to these constants, for the sake of clarity, we spell out a working 
hypothesis that we will repeatedly use. 

Hypothesis 5.2 We have (A,X,v) € V, ||A||,p = 1, and (C,w) an approximate 
eigenpair of A satisfying |£| < 1 and 

p(A, A, u)dist A ((A, v), ((, w)) < c*. 

Also, 6: [0,7r) —§, 9(s) = A s is some arc-length parametrized half-great circle with 
Aq = A. 
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The main goal of this section is to prove the following result. 


Lemma 5.3 Under Hypothesis 5.2, let s < ci/p(Aq, (, w) where 0 < ci < 1/3. Let 

$ = 


Then, 

where 


(DFa 0 ((,w)\ C xw ±) 1 A 0 w 
P~(s) < P(A S , (,w) < f3 + (s), 


a-,, s<L - /3(A 0 ,C,w) - ±c(/p(A 0 ,(,w) 
p w =- T+Uci -' 

For the proof we need some stepping stones. 


a+ , , + /3{A 0 ,(,w) + ±c(/p(A 0 ,(,w) 

13 {s) = -- 


Lemma 5.4 Let A G S, £ £ C, |£| < 1, w G C n , ||u>|| = 1. Then, 

(DF a ((,w) Icx^r 1 <3 n(A,(,w). 

Proof. This result is similar to [3, Proposition 6.6], but our assumptions are 
weaker and our definition of condition number is slightly different (see the proof of 
Theorem 2.12). We can assume that w = ei for the proof and write 


A = 


A a* 

b A) ’ 


where A G C and a, b G C n P Then, for £ G C and w G C 


n— 1 


DF A (C, ei ) C, 


We thus have 


(■ DF a ((,w )\ Cxw ±) 


-l 


< 


= (A- Cld)( ° ) -Cei 

1 w 


A-C cl* WO' 
b A — Cldri —1 y \w 

-1 a> WC 
0 A-(;\d n -iJ\w 


-1 a* 

0 A-Cld„_i 

-1 a* (A C Idn. —i ) 1 
0 (A- Cldn-l)" 1 

0 0 

0 (i- Cldn-l)- 1 


- C e i 


+ 


-1 a*(A- Cldn-l)- 1 

0 0 


— II + w + ll a ll 2 ll A 


1211 A-l 112 


C.M ' 1 


= M A, C, w) + \J\ + \\a\\ 2 p(A,C,w) 2 , 
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the last two claims in this chain by definition of A^ w and the fact that ||^4 ||f = 1 - 
Now, from Lemma 3.4 we know that n(A, (, w) > (1 + \J 1 — ||a || 2 ) -1 which implies 



the last inequality due to 0 < ||a|| < ||^4 ||f = 1- We have proved that 


{DF a ((,w)\ c> 


< v{A, C,w) + 


\a\\ 2 n(A,C,w) 2 < 3n(A,C,w) 


and the bound claimed in the lemma follows. 


□ 


Lemma 5.5 Under Hypothesis 5.2, let s > 0 satisfy 3h(Aq,(,w) s < 1. Then, 
(F)F As (£, w)|cxui- L ) 1 (DF Ao (C,w)\ C xw ±) < 1 _ g 

and 

(DF Ao (C,w)\ Cxw ±y l ( DF Aa ((,w )\ Cxw ±) < 1 + 3 p(A 0 ,C,w)s. 

Proof. Note that 

DF Ao ((,w)\ Cxw ±(i),x) = DF As (C,w)\ Cxw ±(fi,x ) + (j4 0 - A a )x, 

hence 

Id- (i?F.Ao(C,w)| Cxt£ ,±) _1 (FF As (C,w)lcx«+) 

< (DF Ao ((,w)\ Cxw ±y L P s -Ao|| < 3p(A 0 ,(,w)s 

Lemma 5.4 

where we used that ||.A S — x4o|| < ||— ^4 o||f A s. The second claim in the statement 
is now obvious and the first one follows from the Banach lemma, ||ld + A II < (1- 
||A||) 1 , valid for ||A|| <1. □ 

Lemma 5.6 Under Hypothesis 5.2, let s > 0 satisfy 3fr(AoX,w) s < 1 and let 
^aux(s) = (T>Fao(C,w)Icx«;J-) 1 (A) ~ A a )w . 

Then, 

$aux(» - fl(Ao,C, w) ^ , $aux(+) + P(A 0 , (, W ) 

1 + 3 p(A 0 ,C,w)s sA ’ W> - 1 -3ii(A 0 ,C,w)s ■ 
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PROOF. From the definition, 

P(A s ,(,w) = ( j DPa s (C,^)Icx«;-l) 1 F As ((,w) 

= {DFAsiC,™)^^) -1 (DF Ao ((,w)\ C xw ±) (28) 

(DF Ao (C.wOIcxup) -1 ( f a 0 (C,w) + (. A s -A 0 )w ) . 

Thus, 

P(A a ,C,w) < (T>Pa s (C,^)Icx^) _ 1 (- Di? A 0 (C,w)lcx«;-L) (^aux(s) + P(A 0 ,(,w)). 
Similarly, one shows 

D >- W»)-gQ.,C.-) - 

{F>F Ao ((,w)\ C xw ±) (DF As (C,w)\ C xw ±) . 

The statement now follows from Lemma 5.5. □ 

Proof of Lemma 5.3. We claim that we can write 

s 2 

A s = Aq + sAo + ~-B, 

for some B with ||S||f < 1- Indeed, this is an elementary observation which follows 
from the fact that A s is a great circle in the sphere (w.l.o.g. one can choose A s to 
be the circle parametrized by (cos s, sins, 0 ..., 0) to prove it). As a consequence we 
have 

|^aux(s) - S<5 >| < (DF Aq (£, rOIcxuj- 1 -) ~ 1 (A) - A s + sA 0 )w 

<4- {DF Aq (Ci' w )Icxju- l ) 1 < ^s 2 p(A 0 ,(,w). 

^ Lemma 5.4 ^ 

The upper and lower bounds for P(A S , (, w) in Lemma 5.3 now follow from this last 
estimate and Lemma 5.6. □ 

5.2 The step’s length in the homotopy continuation 

The following result is crucial for the understanding of the homotopy algorithm. Its 
proof follows a logic which is similar to that of the proof of Corollary 2.9. 

Proposition 5.7 Under Hypothesis 5.2, let s' > 0 be any number such that 

c[ < p(A, (, w)s' < ci 
and let s" be any number such that 

c'u < p(A,(,w)/3 + (s") < c u , 

where fd + is as in Lemma 5.3. Let s = min(s / , s"). Then, 
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I. The (continuous) branch of the solution map 7T 1 1 o 6 : [0, s] — >• V with 7r 1 1 o 
0(0) = (A,X,v) is well defined. 

II. For every s € [0, s], let (A,, X s , v s ) := 7T]) 1 o 8(s). Then, 

- - 1 p(A, X, v) < p(A s , X s ,v s ) < (1 + 4V3c4)p(A, A, v) (29) 

1 + 4y3c4 

and 


dist 4 s ((A s ,v s ),((,w)) < 


Kc * 

p(A s , X s , u s ) 


(30) 


In particular, (£, w) is an approximate eigenpair of A s with associated eigenpair 
(A s , f s ) for every s € [ 0 , s]. 

Finally, the condition length L fi0t s(7r(( 1 (A s )) (see (18 )) of the curve n f 1 (A s ) is 
at least c-j. 


Before we prove Proposition 5.7 we make a few comments on how the proof of the 
proposition and Theorem 2.18 proceed. The hypotheses fd + (s") give us a bound on 
the distance from (C,w) to (X s ,v s ) and ultimately (A, a) to (X s ,v s ). Together with 
the bound on s' this allows us to invoke Theorem 2.12 and Theorem 2.18 to prove 
conclussions I and II of the proposition. The tricky part will be to prove the last 
statement that L ll ^ t s(TT(( 1 (A s )) > c?. This last statement gives us the upper bound 
on the number of steps in Theorem 2.18 as the condidion length divided by oj. Now 
to see that 


Vs) || Vs) || d>S — 0,s(^"i (-^s)) ^ ^7 


it will suffice to prove that 



||(i s , A s ,u s )|| ds > 


constant 
F(A, X,v)’ 


since p is almost constant on the interval. For this it will suffice to prove that one 
of 


11 -As 11 ds or 


It’sII ds 


is greater than constant / p(A, X,v). The first integral is s. We will see that if s is 
small then dist^Us, v) is greater than or equal to some constant over p(A, A, v), and 
so is the integral of ||u s || which is the length of a path between v and v s . 


Proof of Proposition 5.7. We prove the first part of the proposition. 

From the inverse function theorem and the continuity of p, there exists a maximal 
s* < s such that I and II hold changing [0, s] to [0, s*). The global upper bound for 
p(-K]) 1 (A s )) shown in (29) is in turn an upper bound for the derivative of the solution 
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map, for s€ [0, s*). A standard limit argument in compact sets then implies that 
7rj" 1 o 0 can be extended in a continuous manner to [0, s*], and because (29) and (30) 
are open conditions we must have one of the two following scenarios: 

i) s* = s and both (29) and (30) hold changing s to s*, or 

ii) At least one of (29) and (30) does not hold changing s to s*. 

We now discard the second option. Note that from Hypotheses 5.2 and Theorem 2.8, 

-- . 1 ns A ’ v ) < MA C, w ) < (! + 4 V3c*)n(A, A, v). (31) 

1 + 4V3 c* 

Then, we have for every s € [0, s*] 


dist ((A,C,w),(A s ,C,w)) = \\A- A s \\ f < s* < s' < 


and because s* < s", from Lemma 5.3 we have 


ci 


(1 + 4\/3c*)ci 
n(A,(,w) ~ n(A, X,v) 


< 


P(A s ,C,w ) < /3 + (s) < 


(1 + 4y/3 c*)c u 
n(AX,w) " K A ,X,v) 


< 


From Lemma 5.1 (recall that from II and Theorem 2.12, ((,w) is an approximate 
eigenpair of A s with associated eigenpair (X s ,v s ) for s < s*), this last inequality 
implies 

d\st As {((,w),(\ s ,v s )) < 2 P(A S , C,w) < 2<yl + 4v ^ c ^ Cu . (32) 


li(A,X,v) 


We have thus proved that for every s € [0, s*], 


dist((A, A, v), (A s , A s , v s )) < disU((A, v), (C, w)) + dist((A, (, w), (A s , (, w)) 

+ dist j4s ((C, w), (X s ,v s )) 

c * (1 + 4\/3c*)ci 2(1 + 4\/3 c*)c u 

+ : , : : T 


< 


n(A,X,v) n(A,X,v) 
c 4 


K A , X,v) 


n(A,X,v)' 

Then, from Theorem 2.8 (note the strict inequality in the displayed formula above), 

-- 1 j= n{A, A, v) </i(A s *, X s *,v s *) < (1 + 4v / 3c 4 )/x(A, X,v). (33) 

1 + 4V3c 4 

Thus, (29) holds at s* and moreover 


distA a .((A s .,u a .),(C,u>)) , < , 

(32),(33) 


2(1 + 4\/3c*)(l + 4\/3 c 4 )c u 
KA S ., X s *,v s *) 


< 


K c* 


V{A S *, X s *,v s *)' 
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That is, (30) holds at s* and we can discard option ii), proving the first part of the 
proposition. 

For the last claim of the proposition, note that the condition length of the curve 
vr" 1 ^) is 


(A s )) — 


> 

(29) 


h(tt 1 1 (A s )) Dir 1 l (A s )A, 


.- 1 / 


Jo 

MM 
1 + 4\/3c4 



Dtt 1 


ds 

ds. 


If s = s' then using that the integrand is greater than or equal to 1 and (31) we 
have 


(^s)) — 


/j(A,\,v) ,. ci n{A,X,v) 


-s’ > 


> 


c'l 


1 + 4\/3c4 ~ 1 + 4\/3c4 fi(A, c, w) (1 + 4^c 4 )(1 + 4v / 3c*)' 

Assume now that s = s". Then we have 

n(A,\,v) 


i (As)) ^ 


> 


1 + 4\/3c4 Jo 
n(A,\,v) 


DA 1 (A) A 


ds 


1 + 4\/3c4 

Now, note that (recall (Aq,Ao,vo) = (A, A,v)) 


dist((A 0 , A 0 , v 0 ), (As, A f , Vs)). 


dist((A 0 , A 0 , v 0 ), (Ag, As, v 5 )) 

> dist((A 0 ,C,iv), (As, As, v 5 )) - dist j4o ((A 0 , v 0 ), ((,w)) 

> dist As -((C, w), (As, v f )) - /* ■ 

M(A,A,v) 

We need a lower bound for this last term. We first note that from Lemma 5.1 and 
Hypothesis 5.2, 


/5(A, C, iv) < . 2C * . < 2(1 + 4v ^ C * )c * 


M(A,A,v) h(A,(,w) 
the last by (31). Using this last bound and, again, Lemmas 5.1 and 5.3, we have 

2 1 + 3Cl dist As -((C w), (As-, Vs)) > ! + !! C1 /3 (As-,C,A 


(34) 


1 - 3ci 


1 - 3ci 


^l + 3ci 0 _,_^ P(A,£,w) + fcfA(A,C,v;) 

> -— p (s) = pr( 8 ) - 2- 


1 - 3ci 


> 


n(A,(,w) 


- 2 


1 - 3ci 

f3(A,(,w) + |cf /n(A,(,w) 


> 


-2 


1 - 3ci 

2c* + |cf (1 + 4\/3c*) 


C5 


(1 + 4v / 3c*)/x(A,A,v) (1 - 3ci)/x(A, A,v) /r(A,A,v)' 
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For the inequality in the third line we used the assumption in the statement and 
s = s" . For the inequality in the fourth line, we used (31) and (34). For the equality 
in the fourth line we used the definition of C 5 . 

We have thus shown that if s = s” , then 

J {-if AW > MAA,u) ( c 5 (l — 3ci) _ c* 

1 1 s)) ~ 1 + 4V3c 4 \2(l + 3c 1 ){i(A,X,v) n{A,\,v) 

c 5 (l - 3ci) - 2(1 + 3ci)c* 

= - 1 = - = c 6 • 

2(1 + 3ci)(l + 4 -v/3c 4 ) 

Since C 7 < cq the proof is complete. □ 

We can finally describe the subroutine Choose_step. It is important to note that 
given any matrix A € C nxn , one can compute in 0(n 3 ) arithmetic operations, for 
example by first reducing A to tridiagonal Hessenberg form and then using the main 
result of [28], a number r such that ||A|| < r < \/3||A||. That is, we can compute 
operator norms within a factor of y/3 and, consequently, we can compute /i within 
a factor of \/3- 

Algorithm 6 Choose_step 

Input: S, and (C,w) € C x C n , ||rc|| = 1 


compute r>0 such that {i(B, £, w) < r < w) 

s' := ci/r 


$ : = 


( DF b ((,w)\ C xw ± ) 1 Aw 


compute s", the solution of the following linear equation 


<F s" + /3{B, C, w) + |c|\/3/r _ 

1 — 3ci r 


s := min(.s / , s") 

Output: As = s G [0,7r] 

The step size computed by Choose_step cannot be too small, as we show now. 

Proposition 5.8 The value As returned by Choose_step(5, A, £, w) satisfies 

. R 

As > - 

with R = 07 ( 6(1 + 4 \/ 3 c 4) 2 ) -1 > 0 . 
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5.3 Proof of Theorem 2.18 and Proposition 5.8 

We now prove Theorem 2.18, and the proof of Proposition 5.8 will follow straigh- 
forward from our arguments. 

From the definition of Path-follow it is clear that we can assume that ||Ao||_f = 
11 A | = 1. We further assume that the constants c\ , ci, c' u , c u c*, C4, C5 , eg , c-j and K 

take the values in Table 1 and denote by tti 1 (Ca 0 ,a) the lift of £a 0 ,a with origin 
(Aq, Xo,vq). Note that this lift is well-defined since, by hypothesis, £a 0 ,a H £ = 0. 

Let Bi be the matrix B at the beginning of the ith iteration of Path-follow. Also, 
let (A i,V{) be such that (_£>*, Aj, is the (only) triple in t:^ 1 (Ca 0 ,a) above B ;. 

We first prove that for all i > 0, (Cii w i) is an approximate zero of Bi with 
associated eigenpair (A i,vt) and satisfies 

We reason by induction. The step i = 0 is true by hypothesis (recall Definition 
2.17). For the induction step, note that the s' defined by Choose_step satisfies (we 
omit the subindices i in (Bi,Qi w i) hr the next few lines) 

_£1_<_£1_< s ' <_£1_. 

n(B,(,w) V3 fi{B,C,w) n{B,(,w) 


Moreover, s" satisfies 
f5 + (s") = 


T s" + /3(B, C, w) + | (%/fi(B, (, w) 

1 - 3ci 

s" + /3(B, C, w) + |c|\/3/r _ Cu < c u 

1 — 3ci r ~ n(BX,w)' 


and 

P + (s") 


$ s" + /?(£, C, w) + I cj/n(B, C w) > <5>s" + /3(B,C,w) + |c|/r 
1 — 3ci ~ 1 — 3ci 

c u - fcfQ/3 - 1)/(1 - 3ci) > c u - |c^(\/3 — 1)/(1 - 3ci) > d u 

r ~ V3n(B,(,w) ~ n(B,t,w)’ 


We are thus under the hypothesis of Proposition 5.7 with s = As (by construction 
in Choose_step), which guarantees that (Cii w i) is an approximate eigenpair of the 
matrix Bi + \. Moreover, Proposition 5.7 also implies 




Kc * 

M(-®j+i i Aj+i , Vi-i- 1) 


Since AT* < 1/5 we deduce (using Theorem 2.12) that (<£, vj t ) is an approximate 
eigenpair of B l+ \ with associated eigenpair (Aj+i,Uj+i). Consequently, after three 
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steps of Newton iteration, we have that (O.+i, itfj+i) satisfies (recall, K = 64) 


d\st Bi+1 (((i+i,w i+1 ), (Aj+i,v,+i)) < 


1 64c* 

2 23_1 Ai + i, Vi + \) 

c* 

2/i(-B.j_)_i, Aj+i, 


( 35 ) 


If |Ci+i I < 1) this finishes the induction step. Otherwise, the algorithm divides C*+i 
by its norm, in that case we have 


distg. 


i +1 


0+i 

ICi+il 


w i+ i , (A i+ i,u i+ i) < distg. 


'*+i 


O+i 

IO+i I 


,Wi +1 , (0+l,^i+l) + 


dist B ((O+i, m+i), (A i+ i, v i+ i)) 


< IO+i I — i + 


2/r(5j+i, Ai+i, Vi- |_i ) 
On the other hand, from (35) we have (use |Aj+i| < ||.Bj + i||g = 1) 

IO+i — Aj+i| < 0 ---——- => |0+iI < 1 + 

so we have 


dista 


*+1 


2/r(fij+i, , Uj_|_x) 

0+1 


2/r(i3.j+i, ) 


10 


i+11 


,Wi +1 , (Ai+i, «i+i) < 


/i(i?j+i, Aj_|_i, Uj_)_i) 


and the induction step is finished also in the case that |0+i| > 1- 

The induction step is complete. In particular, this shows the last part of the 
statement. 

To show the complexity bounds, assume Path-follow has performed q+i iterations 
and let 0 = -so < si < ... < s q < ... < s q+ g be the corresponding values of s. Then 
we have 

(H1 {Bs)) = ^ ] / M(-^S, Ag, V s ) || (j4 s , \ s , V s ) || ds 

1 = 1 4sq + i-1 
l 

= -I y M + < J +i-l+q + i( 7r i (^s)) 0 

1 = 1 


the inequality by the last claim of Proposition 5.7. But the algorithm halts as soon 
as s q - 1 _£ = a, i.e., as soon as 

I J H,s q ,s q+ i{' IT 1 (As)) = L /JjSq0 ,( 7T 1 (A,)), 

which occurs as soon as £ > c^ 1 J'° n(A s , X s , i> s )||(.A s , A s , h s )|| ds, as claimed in the 
theorem (note that C := c)T 1 < 3000). 
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We finally prove Proposition 5.8. From Proposition 5.7 we have 


y 2 (B, (, w) As > 

(29) 

> 

Prop. 2.7 


1 f As 

(1 + 4V3C4) 2 l f (A >’ X ” v > )is 
6(1+4\/3c 4 )2 i "- ai - ( ’ ri 1(A * )) - 6(1 + 4v / 3c 4 ) 2 


so Proposition 5.8 follows. 


□ 


6 Integration in the solution variety 

6.1 The coarea formula 

On a Riemannian manifold M there is a well-defined measure voIm obtained by 
integrating the indicator functions 11 a of Borel-measurable subsets ACM against 
the volume form dM of M, 


vol m (A) : = / ll,i dM. 

Jm 

Dividing 11 by vol m(M) if vol m(M) < oo, this leads to a natural notion of uniform 
distribution on M, which we will denote by (M). More generally, we will call any 
measurable function /: M —>• [0, oo] such that f M f dM = 1 a probability density 
on M. 

The coarea formula (a modern classical formula due to Federer [25], see the 
Appendix of [27] for a smooth version) is an extension of the transformation formula 
to not necessarily bijective smooth maps between Riemannian manifolds. In order 
to state it, we first need to generalize the notion of Jacobians. 

Suppose that M, N are Riemannian manifolds of dimensions m, n, respectively 
such that m > n. Let if: M —>• N be a smooth map. By definition, the derivative 
Dif(x): T X M —>• T t j,^N at a regular point x € M is surjective. Hence the restriction 
of Dif(x) to the orthogonal complement of its kernel yields a linear isomorphism. 
The absolute value of its determinant is called the normal Jacobian (sometimes 
called normal determinant in the context of Linear Algebra, see [1]) of if at x and 
is denoted by NJif(x). We set NJ^(x) := 0 if x is not a regular point. 

If y is a regular value of if , then the fiber F y := if^ 1 (y) is a Riemannian sub¬ 
manifold of M of dimension m — n, and it makes sense to integrate functions on F y . 
Moreover, Sard’s lemma states that almost all y £ N are regular values. 

We can now state the coarea formula. 

Theorem 6.1 (Coarea formula) Suppose that M,N are Riemannian manifolds 
of dimensions m, n, respectively, and let if: M —> N be a surjective smooth map 
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such that Dip is surjective a.e. Put F y = ip 1 (y). Then we have for any function 
X : M —>• M that is integrable with respect to the volume measure of M that 


L xdM =L{L^ dF, ) dN ’ 

and the integrals involved are well-defined. □ 

It should be clear that this result contains the change of variables formula as a 
special case. Moreover, if we apply the coarea formula to the projection 712 : M x 
N —>• N, (x,y) i —> y, we retrieve Fubini’s theorem since Nj7r2 = 1. 


6.2 Coarea formula and double fibrations 

The coarea formula can be readily applied to the following situation. Assume that 
three Riemannian manifolds M, N\, N 2 are equipped with surjective smooth map¬ 
pings 7Ti: M —y N\ and 7T2: M —> N 2 whose derivatives are a.e. surjective, so NJ7 Ti 
and NJ7T2 are a.e. nonzero. Let y: M —>■ [0, 00 ) be a measurable mapping. From 
Theorem 6.1 applied to 7Ti we have (here dx and dy stand for the volume forms in 
M and Ni, respectively) 


ixeM 


y(x)Nj7Ti(x) dx = 


lyGNi Jx£tt 1 1 (y) 


x(x) dx dy. 


On the other hand, Theorem 6.1 applied to n 2 yields 

f NJtt^x) 


IxeM 


y(x)Nj7Ti(x) dx = 


' Z£N2 JX^7T 2 1 (z ) NJtt 2 (x) 


x(x) dx dz. 


We thus have the following result. 


Theorem 6.2 Let M, N\, N 2 be Riemannian manifolds equipped with surjective 
smooth mappings tt\ : M —>• N\ and 112 : M —> N 2 whose derivatives are a.e. surjec¬ 
tive. Let x- M —>• [0, 00 ) be a measurable mapping. Then, 




x( x ) dx dy 



NJ7Ti(x) 
NJ7T2 (x) 


x(x) dx dz. 


(Note that when M and N\ have the same dimension, one can replace the inner 
integral in the left-hand side by a (finite or enumerable) sum). □ 


In the next sections we shall apply this result in two different contexts. A linear 
algebra argument simplifies the computation of the quotient of Normal Jacobians. 
Let E and F be finite dimensional, complex Euclidean vector spaces and let (p: E 
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F be a surjective linear mapping. Consider the graph T := {(x, tp(x)) \ x € E} of p. 
Then, T is a linear subspace of E x F and the two projections 

pi : T —> E, (. x , p{x)) i->- x, P 2 : T —> F, (x, p(x)) >->■ p(x) 

are linear maps. Note that p\ is an isomorphism and P 2 is surjective as p is. 


Lemma 6.3 Under the above assumptions, we have 


NJpi 

NJp2 


det(^*)| 1 . 


Proof. This result is [12, Lemma 3b), p. 242], although we rewrite it for complex 
vector spaces here (note the comment in the proof of [12, Theorem 5, p. 243]). 

□ 


6.3 The solution variety for the eigenpair problem 

Recall from §2.2, we have the two projections 

to: V ^C nxn ,(A,A,v) ->A and tt 2 : V -> C x P(C n ), (A, A, v) -> (A, v) 

and, for (A, A, v ) € V, the linear operator A\ v : v 1 - —>• r -1 given by P v ± (A — A ld)i„j. • 
In order to apply Theorem 6.2 we first need to compute the quotient of normal 
Jacobians there. 


Proposition 6.4 Let p := (A, X,v) € VV and choose a representative such that 
||r|| = 1. Then, the derivative Dtt\ (p): T p V\J —>■ TaC nxn is an isomorphism, the 
derivative Dn 2 (p ): T P W —>• T^x )V ){ C x P(C n )) is surjective, and we have 


NJvri(p) 

NJ7r 2 (p) 


det v 4 a ,^| 2 


det^A^yJ- 


Proof. By orthogonal invariance, we may assume without loss of generality that 
v = ei = (1, 0 ,..., 0), and then 


A = 




c € C n_1 , B e d n “ 1)x ( n-1) » 


so A\^ v = B — Ald„_i. Let T = {(A, DG(A)A) : A € C nxn } C C nxn xCx® 1 where 
G is the appropriate branch of the solution map defined in some open neighborhood 
of A. We are under the hypotheses of Lemma 6.3 so we have 


NJ7T1 ip) 
NJ7T 2 (p) 


NJ(Lhri(p)) 

NJ(t4vr 2 (p)) 


det{DG{A, A, v)DG{A, A, u)*)" 1 . 
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From Lemmas 3.1 and 3.2, we have 

'(Av,v- i C nA~* v P v ±A*v) 


DG(A, A, v)A = 


~ A \l P iP Av 


v* - v*Ai C nA x ^ v P v i 

-K p v- 


Av = RAv, 


where, in the last displayed formula, we are denoting by R: C n —>• C x v 2 - = C n the 
linear operator multiplying by Av in the previous formula. A standard linear algebra 
argument then shows that det(DG(A, A, v)DG(A, A, v)*) = det (RR*) = |det(i?)| 2 . 
Now, we can identify 


ZCn 


(idl)- p - s (° ld ”N. 


which implies that in standard basis we have 


R = 


1 * 

0 -^,1 


thus showing that | det(i ?)| 2 = | det A a *| 2 , and the proposition follows. □ 

We are now ready to rewrite Theorem 6.2 in this setting. The following is an 
important technical result that we will use several times. 


Proposition 6.5 Let V —>■ [0,oo) be a measurable mapping. Then, 


I Ae C n 


X x(A A, y )dA 

\,v:Av=\v 


1 (X,v)eCx¥(C n ) JA:Av=Xv 


x(A, A, i>) | det(A ^ it ,)| 2 dAd(\, v) 


Moreover, assume that x JS unitarily invariant in the sense that x(A A, v) = 
x(UAU*, X,Uv) for any unitary matrix U. Fix any a.e. continuous mapping 
C n \ { 0 } -A- U n , v eA-XJy such that U v e\ = v/\\v\\ for all v. Then, for every A € C nxn 
and (T > 0, 


E 


A~N cnxn (A,cr 2 ) \\ v: Av=\v 


X x(AA,i 


(36) 


1 


E E 


r(n)cr 2 ( n ^ v ( X,W,B ) 


e X 


A w* 

0 B 


, A,ei ) | det (5 - A/ n _i )| 2 ) , 


where v € P(C n ) has the uniform distribution, y v = P v ± Au/||u||, and A A/c(A,a 2 ), 
w ~ cr 2 ), and B ~ A/ r C ( n -i)x(n-i) (B, cr 2 ) are independent Gaussian random 

variables centered at 

A := W := J*U*A*U v e i* B := J*U*AU V J. 

IMr 

Here, J is the n x (n — 1) matrix whose columns are e-i,...,e n (and, hence, J* = 
[0 ld n _i] is the matrix of P e ± : C n —>■ e\). In particular, if A = 0 then y v = 0, A = 0, 

w = 0 and B = 0, so v can be removed from the expected value in (36). 
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Proof. The first claim of the theorem follows directly from Theorem 6.2 and 
Proposition 6.4. 

For the second claim, let I be the left-hand side of (36). Change x(4., X,v) to 

2 _\\A-A\\ 2 f 

(a 2 ir)- n x(A,\,v)e in the first formula to get 


I = 


(<7 2 7r) n J(A,-y)eCxP(C") JA:Av=\v 
Note that {^4 : Av = An} can be parametrized by 


||A-A|||, 

x(A, X,v) I det(^A,-y)| 2 e ^ dAd(X,v). 


(w, B) i->- A = U v 


A w* 
0 B 


U* 


where w € C n— 1 , B € (£j(n—i)x(n— l). This parametrization preserves distances; 
moreover | det(^4A,v)| = | det(A/ n _i — B)\ and from the fact that % is unitarily 
invariant, we have 


I = 


(cr 2 7r) n J(x, v , w ,B) 


X 


A w* 

0 B 


,A,ei) | det(A/ n _i — B)\ 2 e ^ d(\,v,w,B) 


\\A-A\\ 2 f 


where A is given by the formula above. Note now that we can write 

12 


\\A- A\\% = 


0 fl I - U Z AU » 


= |A - e\U*Au v ei\ 2 + \\w* - e\U* v AU v J\\ 2 + ||B - J*U*AU V J\\ 2 F 
+ \\J*U*Au v e\ || 2 

12 + || B - J*U*AU v J\\'f + 


v*Av 

2 


a || no 

+ 

w - J*U*A*U v e i 

\\v\\ 




lv II 2 , 


and the second claim of the theorem follows noting that the volume of P(C n ) is 

7T n_1 /r(n). □ 

6.4 The linear solution variety 

It will be useful to consider a geometrical scheme similar to that of §6.3 for the case 
of solving linear systems: we consider 

V lin = {(M, v) € C (n " 1)xn x P(C n ) : Mv = 0}. 

The linear solution variety V lm is a n(n — l)-dimensional smooth submanifold of 

C (n-i) 

xn x P(C n ), and again it inherits the Riemannian structure of the ambient 
space (cf. [16, (17.14)]). 
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( 37 ) 


The linear solution variety is equipped with two projections 

7r lin : V lin -> c( n-1 ) xn and nf : V lin -> P(C n ) 

(M, v) i->- M ( M, v ) i->- v. 


For M G C^ n 1 ) xn ; (7r lm ) 1 (M) is a copy of the projective linear subspace corre¬ 
sponding to the kernel of M in P(C n ), and for v G P(C n ), (tt^") -X ( v ) is a copy of 
the linear subspace of C( n_1 ^ xn consisting of the matrices A such that Av = 0. 

We can apply Theorem 6.2 for integrating functions in V lm using the projections 
in (37). In this case, the tangent space to V lm at (M,v) can be identified with 

{(M,v) : Mv + Mv = 0, v*v = 0} = {(M,v) : v = <p(M)}, <p{M) = —M^Mv. 


Note that ip is a linear mapping defined from C(” -1 ) xn to u -1 . A routine computation 
shows that, if ||u|| = 1, then pp* = M'(M')*. Writing down the singular value 
decomposition of M, it follows that det(<^</?*) = det(AfAP) -1 . From Lemma 6.3 it 
follows that 


NJ(vr lin )(A/,u) 

NJ(vr^ n )(Af,u) 


det(MM*)|. 


(38) 


Proposition 6.6 Let <^ lm : V lin — >• [0, oo] be a measurable unitarily invariant func¬ 
tion in the sense that cj) Un (M,v) = for any unitary matrix U G Lin- 

Then, 

r E (</> lin (Af,ker(Af)) = -i- E (<)» lin ((0 B), ei ) | det(H)| 2 ). 

M ~N c (n-l)xn r(n) S~A/ C ( n _i) x ( n -1) 

Proof. Let %(Af) = ker(M))e~^ M ^F. Theorem 6.2 and (38) imply that 




x(M) dM 


IvG P(C") JM:Mv =0 


det(AL)| 2 y(Af) dM dv. 


Now, | det(AL)| 2 x(AL) = | det(MU)\ 2 x(MU*) for all U G U n by hypothesis. Hence, 
by parametrizing {M : Mv = 0} by {(0 B)U* : B G c( n-1 ) x ( n-1 )} where U v G U n is 
any matrix satisfying U v e\ (we are assuming ||u|| = 1), we conclude that the inner 
integral in the right-hand side of the formula above does not depend on v. We thus 
have 


I Me 


X {M)dM = vol(P(C n )) 


I Be 


det(H)| 2 x((0, B)) dB. 


The proposition follows from the form of the Gaussian density (recall §2.6) by noting 
that vol(P(C n )) = 7r n_1 /r(n). □ 

Assume now that we are given a measurable nonnegative function 
a: Ck n-1 ) x ( n-1 ) —> [0,oo]. We can produce a unitarily invariant function defined 
on V lm as follows 

(j} m (M,v) = E (■ a(MQ )), 

Q:(M,Q)~A n 
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where A n is given in Definition 2.30 (observe that Q follows the natural, uniform 
distribution on the Stiefel manifold). Note that 


</> lin ((0 B),e\) = E (a(BU)). 

U&Un-i 


It is a simple exercise to check that (j} m is unitarily invariant in the sense of Propo¬ 
sition 6.6. Applying Proposition 6.6 to </> lm then yields 


E E ( a(MQ )) 

M Q:{M,Q)&Au 


E E (a(BU)\det(B)\ 2 ). 
I (n) b u~u n - 1 


Finally, using Fubini’s theorem, we can interchange the integration order in the 
right-hand term, and then note that the isometry B i— )• BU preserves the value of 
the integral inside. We obtain the following corollary. 


Corollary 6.7 Let a: C^ n 1 ) x ( n P —>■ [0, oo] 


be an a.e. continuous function. Then, 


E E ( a(MQ )) 

M Q:(M,Q)&A„ 


1 

fTA , E 

r(?r) B~AA c ( n _i) X ( n _i) 


(a(B)\ det(H)| 2 ). 


□ 


7 Proof of Theorem 2.14 

We begin with the following result. 

Proposition 7.1 The following inequality holds for every A € C nxn and a > 0. 

E _ (||A^ 1 ||^ |det(A)| 2 ) < ™ E _ (|det(A)| 2 ). 

Furthermore, equality holds if and only if A = 0. In particular, 

E (||A- 1 ||^|det(A)| 2 ) =m\ma 2m - 2 . 

X m (0,(7^) 

Proof. Expanding the determinant of A by the kth column we have 

m 

det(A) = ^(-l) J+fc a jifc det A j ’ k t 

3 =1 

where A^ ,k denotes the matrix that results from the matrix A by removing the jth 
row and fcth column. Hence, 

m 

|det(A)| 2 = det(A)det(A) = ^ (—i)J+i'+2fc a ^ fc ajr^ det A^ k det A>’’ k . 

j,j '=i 
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Observe that the random variables cij^ and cij^k are independent of det A J,k and 
detA^' ,k . Then, 


m 

E | det(^4)| 2 = (-l) i+i ' +2fc E (a jtk ajTk) E(det A j ’ k det AS'*). 

^~AC cmxra (A,(T 2 ) j,j '=1 


Now observe that 


1E(®j,fc ®j',fc) 


dj,k 0>j',k if j / J i 

a 2 + \aj t k\ 2 otherwise. 


We conclude that for k = 1,..., m, 


E _ | det(^4) | 2 

Ar^jf j-, m x m {A,(7^) 


E | det([v4; k\ A k ])\ 2 + a 2 ^ E | det A j ’ k | 2 

i=i 


(39) 


where [A\ k ; is the matrix formed by replacing the (random) fcth column of A 
by the (deterministic) kth column of A. 

On the other hand, from a direct application of Cramer’s Rule and (39), we 
deduce that 

m 

a 2 E ||A -1 |||i | det(^4)| 2 = a 2 ^ E | det A^ k \ 2 

A'^J\f l £rnXrn(A,(T‘2') j^ = -^ 

m 

= m E | det(yl)| 2 - ^ E | det ([A; k; A k ])\ 2 , 

k =1 


and the first claim of the proposition follows. Moreover, when ^4 = 0, the last term 
in the sum above is zero. We leave the proof of the converse to the reader. Using 
(39) and the fact that the matrices A^ k are A^( m -i)x( m -i) (0, cr 2 )-distributed, one 
can prove working by induction the equality 

E \det(A)\ 2 = a 2m m\. 

^~.'N C mxm (0,<X 2 ) 

The second claim of the proposition follows. □ 


Corollary 7.2 


E (\\M'\\ 2 F ) = n- 1. 
Mr ^'J'f<c( n — l)xn 


Proof. From Proposition 6.6 with <)> lin (M, £) = we have 

1 


E (H Mt llF) = ^T D E (115- 

M ~A/ C ( n _ i > xn b(ra) R~A/ c ( n _i) X ( n _i) 


F ,det(5)| z ) = n- 1 

Prop. 7.1 


as claimed. 


□ 
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Corollary 7.3 For any B € C^ n X ) x ( n 1 ) J a > 0, and A € C, we have 

E (||(i? - \I n -i)-% | det(5 - A/ n _!)| 2 ) < (I det(5 - AJ n _i)| 2 ), 

where B ~ jV C (r,-i) X (n-i) (B, o 2 ). 

Proof. Note that 

E(||(5-A/ n _i)- 1 || 2 ,|det(5-AI r) _i)| 2 ) = E , (||C'- 1 || 2 , | det C| 2 ), 

B C~A/ c ( n _i) x ( n _i)(C,cr 2 ) 

where C = B — \I n -\. The proof readily follows from Proposition 7.1. □ 


Proof of Theorem 2.14. Fix any a.e. continuous mapping v H > U v such 
that for v € P(C n ), U v is a unitary matrix with Ueo = u/||a||. From (36) applied to 
x(A,X,v) = ^ 2 f {A,\,v)/\\A\\ 2 f = yWA-'j 2 , we have 


E 




(40) 


1 


ttt E E (e~— ||(5-AI n _ 1 )- i ||^|det(5-A/ n _ 1 )| 2 ), 


nT(n)o 2 ( n P v ( \, W ,B ) 

where y v = P„± Au/||r;||, v € P(C n ) has the uniform distribution, and A A/c(A,a 2 ), 
re ~ N£n-i(w,o 2 ), and B ~ .A/" C (n-i)x(n-i) (B, o 2 ) for some A ,w,B which depend 
uniquely on A and v. 

From (40) and Corollary 7.3 we have 


< 


E 

A~A/' C nxn (4,cr 2 ) 

n — 1 


VLvO 4 )' 


■E E (e T2-|det(5- A/ n _i)| 2 ). 


cr 2 cr 2 ( n -l)r(ra + 1) v (X,w,B) 

Now, if we apply again (36) to the constant function y = 1/n we get 

1 


1 = 


E 


( 1 ) = 


A~A/’ c mxm(4,0-2) T(?r + 1 )o 2 ( n !) v (A ,w,B) 


E E (e ~~F~ | det(5 — A/ n _i)| 2 ), 


and we have thus proved that 

E 

A~A/' c „xn(4,a-2) 


Ff, av(^) \ < n - 1 < n 


- (J 2 (T 2 ’ 


as claimed. 
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let 


This proves the first part of Theorem 2.14. For the second part of the theorem, 

I = E p 2 Fay {A) 

A~W( S) 


be the quantity we want to compute. From the first part of the theorem (with A = 0 
and a = 1) we have 


_L [ v 2 fA a ) c-mi d A< 
Pill 


n. 


(41) 


On the other hand, 

P.avP) _ 


-4/ 

vr n JagC” 




7T 


0 P JA:\\A\\ F =p 


V 2 F, a v( A )dAdp. (42) 


Now, because pF,av(A ) is invariant under multiplication of A by nonzero complex 
numbers, denoting u p = volfA : pp = p), we have 


- / 

V P JA:\\A\\ f = p 

We deduce from (41-43) that 


PF, avP) dA = I, 0 < p < oo. 


(43) 


Note now that 

to conclude that 

I < 


T I" 00 v e~ p 
-4/ dp<n . 

vr" 2 7o P 2 


„ 2 n — 1 


T(n 2 


nr(n 2 


r?F (n 2 


2 /o°° p 2n2 ~ 3 e~ p2 dp T(n 2 -1) 
The theorem follows. 


= n(n 2 — 1) < ri 3 


□ 


8 Proof of Propositions 2.21 and 2.23 

8.1 Proof of Proposition 2.21 

First note that Path-follow starts by normalizing the input, so from (14) we can as¬ 
sume that pop = 1 and A A/pxnpO, 1) where T = \/2n. From Remark 2.22, 
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for integration purposes we can also assume that {Ca 0 ,a \ {A)}) H £ = 0. Corol¬ 
lary 2.20 with q = 1 implies that 


Avg_Num_lter(A, A 0 ,u 0 ) = E K(A, A 0 , X 0 ,v 0 ) 




CnXn p 


< 2 + V6 C E P||f [ — 


H 2 (A t ,\ t ,v t ) 


dt 


<2 + V6C E 


r»l n 


A~Afg 


0nXn p 


E 




3=1 


IP 


t\\F 


(44) 

dt, 


where A t = (1 — t)A$ + tA and the pairs (X^\v\^) are defined by continuation 
for all the eigenpairs of At- Note that we need to write 2+ instead of 1+ as in 
Corollary 2.20 because we need to bound the smallest integer which is greater than 
the integral, rather than the integral itself. 

We therefore have (use T = sf2n) 


Avg_NumJter(APo)'Wo) < 

< 

(13) 


2 + VGCuT E 

A~J\f C n x n p 

2 + ViSCn 2 E ( 

Ar^J\f j-, n Xn ' 



hav(A) 

II Ml 



Mav(A) rU \ 

II Ml dt J ' 


(45) 


In order to bound the last term in the previous expression, we interchange the order 
of integration, 


E 

.4 ..- tl x n 



Aav(A) 

IIAIIf 



E 


Iti 


C nXr 


V IIAIIf ) 


dt. 


Now, for fixed t, if A ~ A/" C nxn then A t = (1 — t)Ao + tA satisfies A t ~ A/c«xn((l — 
t)Ao,t 2 ) and from Theorem 2.14 we have 


E 

A^Af C nxn 


V IIAIIf ) 



which implies 


E 

4 ~,V r- ,1 X n 



Aa vPt) 

IPP 



fi 


< 


n , n 
—t dt. < 
t 2 ~ ti 


(46) 


We are thus left with the task of evaluating t\. But we have (recall from Corol¬ 
lary 2.20) that 


t] 


1 

|| A ||_p(sin a cot si 


1 

- > - . 

cos a) + 1 T(cot si + 1) + 1 
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We now use that si = Choose_step(Ao, A, Ao, vq) (the length of the first step in the 
execution of Path-follow) and this is at least ^ ^\ Q .,, 0 ) by Proposition 5.8. Hence 

cot si < jj- < ^ ( A °Ao,vo) an d ^ follows that 


tl ^ (n// 2 (A 0 , A 0 ,u 0 )) ’ 

Putting together this bound and inequalities (45) and (46) we deduce the claimed 
bound for Avg_Num Jter(Ao, Ao, no). 

We next prove the smoothed analysis bounds. Reasoning as in (44) we see that 
the smoothed number of iterations SmcLNum Jter(Ao, Ao, vq, a) is bounded by 


2 + V6C sup 

AeS Ar 


E 


‘K 




rl ,,2 (A \O') ,.U)\ 
A i v t ) 


j = 1 


WMl 


dt. 


The rest of the argument is almost exactly as above, the only difference being the 
bound ||x4||i? < T + ||A||,p = y/2n + 1. □ 


8.2 Proof of Proposition 2.23 

We are now following all the n paths (each starting with a different eigenpair of Ao). 
Applying Corollary 2.20 with q = 1 to each of them we obtain 


Avg_Num_lter(Ao) = E K(A,Ao,X^\v^) 

A~AC 


C"X»,T j = l 


<2 n + VQC E 


n A , ,2 




£n X n t 


■E 


WMl 


j =l b 


dt (47) 


<2 n + V6C E ||2i||F 

X n jp 


^ ” M 2 (Ai, A?W>) 

^ ^ Pdll ’ 


)=i 


where = minjf^,..., t^}. We can now reason as in the preceding proof to 
deduce that 

Avg_Num_lter(A 0 ) = 


(7) 

as well as the fact that, for j = 1,..., n, if > P 
these bounds that 

t* > n 


n[A (A o,A(^) juU )) 


1 

n^max(^o), 

The rest of the proof follows as in the preceding proposition. 


It follows from 


□ 
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9 Proof of Theorem 2.28 


We begin with an auxiliary result. For simplicity, in what follows we write § := 
S(C nxn ). We also consider the manifold 

5 = {(A, A) eSx§:ie T A §} 

and denote by (<S) the uniform distribution on it. Given any measurable mapping 
4>: S —>• [0, oo], let 

/ /-ds(Ao,A) \ 

h ■■= E / (f>(A a ,As)ds\ , (48) 

A 0 ,A~^(S) yj 0 J 

where, as usual, the A s are such that {A s : 0 < s < d§(Ao, A)} = Ca 0 ,a , and A s is 
the unit tangent vector (in the direction of the parametrization) to C-Aq.a at A s . 

Lemma 9.1 For any measurable mapping <j>: § x § —>• [0, oo] we have 

h = F E <KA,A). (49) 

z (A,A)~^(5) 

Proof. We consider the manifold 1Z = {(Ao, A, s) G SxSx (0, ir) : s < dg(Ao, A)} 
and let 

'R' —^ S 

(Ao>A, s ) H > (xl s , x4 s ). 

We can then write 1$ = vol(§)~ 2 (j) o ijj. Applying the coarea formula (Theo¬ 

rem 6.1) yields 


A, = 


vol(S) 2 


/ (A,A)e<S 


<f>(A,A)q(A,A)d(A,A), 


(50) 


where 


W 




Our goal now is to prove that q(A, A) is a constant (independent of A, A). It shall 
be useful to consider the two diagonal matrices A = E\\ and A = E 22 where Ejj 
denotes the standard basis of C nxn . Note that (A, A) G S. 

We now fix (A, A) G S. Let a: C nxn —>• C nxn be an isometric change of basis 
such that <7 (A) = A and <7 (A) = A. Denoting as = a x a and an = axaxI r, where 
Jr is the identity mapping in M, it is easy to check by writing down the formula for 
A s that iJj o on = as o ip. We are under the hypothesis of [12, Lemma 4, p. 244] 
(which holds for surjective maps in general, not only for projections), proving that 


NJ(^)(^(C,Co,s)) = NJ(^)(C,C'o, S ), V (C, Co, s) G 1Z. (51) 
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Moreover, the mapping an is an isometry from ip l (A, A) to ip 1 (A, A), 

which from the change of variables theorem implies 


q(A,A) = 


1 


■ d(B 0 ,B,s ) 


W-B^e^-hAA) NJ {ip){a^(B, B 0 , s)) 

1 1 d{B 0 ,B,s) = q( A, A), 


( 51 ) ^(B 0 ,B,s)ei/'“ 1 ( A A) NJ(V0(-®> 5 0 , s) 


which proves that q(A, A) is equal to some constant C. 

Since this holds for any measurable function </>, we can take the latter to be 
constant with value 1 to derive the value of C. Then (50) becomes 


h 


vol(S) 2 


lA,AeS 


C d(A, A) 


vol (£) A 

vol(§) 2 


And it follows from (48) (always with <p = 1) that 


I<t> = E d§(A 0 ,A). 

A,A 0 ~W( §) 


Hence, 


E d§(A, A 0 ) 

A,Ao r ^ , ^ r (S) 


A vol (£) 
vol(§) 2 


Note that the change of variables Aq t-A —Aq does not change the expected value in 
this last formula. Moreover, d§(A, Aq) + d§(A, — Aq) = i r for all A,Aq € §. Thus, 


vol(5) 

vol(§) 2 


E d§(A, Aq) + E d§(A,—Ao) 

AiAo^ty (§) A,Ao~W (§) 

E (d§(A, A 0 ) + d§(A, —A 0 )) = ir. 

A,Ao~ty (§) 


proving that 

A vol (§) 2 ^ 

2vol(«S) ’ 

From (50) we conclude that for any measurable nonnegative function <p we have 


= E 4>{A, A), 
z {a,A)~w(S) 

as wanted. □ 

Proof of Theorem 2.28. Consider the measurable function cp: S -A [0,oo] 
defined by 

4>(A,A) = ^ A, f)|| (A, A, v) ||, 

(X,v):Av=Xv 
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for A € § and A G TA§ such that A 0 E, where A, v are the functions of {A, A) and 
(A,w) given in Lemma 3.2. (If A £ S we set cj)(A,A) = oo.) 

From Theorem 2.18, denoting / = we have (for some constant C > 1) that 


E 

A,Ao^Af cn xn 


E K(A,A 0 ,X 0 ,v 0 ) | <CJ. 

Aq ,vo :Aqvo= Aoro 


(52) 


Note that the left-hand side of (52) is the quantity to be bounded in Theorem 2.28. 
It is therefore enough for us show that I < 4n 2 . To do so, write 5 a := {A 1 : 
{A, A 1 ) € 5} C Ta§, for A €E §. First note that £4 is just the unit sphere in Ta§, so 
it has a natural volume form inherited from C nxn and voI(5a) is independent of A. 
Moreover, the Normal Jacobian of the projection S -A §, {A, A) A, is constant 
and equal to I/a /2 (this is easy to prove: check that for A € the pair of the 

form {A, A') in T(a,A')S which is orthogonal to the kernel of the derivative of the 
projection is (yl, — Me(yT, A)A), then note that the vectors of that form obtained 
from any o.g. basin of TA§ whose first element is A' are orthogonal and only one 
of them, (A',—A'), changes its norm by y/2), so we have vol(5) = \/2voI(§)voI(5a). 
From Lemma 9.1 and Theorem 6.1, we then have 


I =5 E </>(A,A) 

z (A,A)e<s 


7r 

-P E 
V 2 AeS 


(t E v(A,\,v) E (||(i,A,h)||) 

\ ( \,v)--Av=\v ^ £<Sa 


(53) 


In order to estimate this last quantity we shall use Lemma 9.2 below. Note first 
that from Cauchy-Schwartz, 


.E (||(i,A,h)||) < 

As5a 


< 

Lemma 9.2 


< 

n> 2 


1/2 


1+ E (|A| 2 + |H| 2 ) 
4 . 


1 + 


1 


n 2 — 


— (l + 2fip{A, A, v ) 2 ) 


16 


( 9 + /HAAj) 


1/2 


1/2 


It is a simple exercise to check that for positive i£lwe have x(9 + 1 6x 2 /n 2 ) 1 ^ 2 < 
9n/8 + 4 x 2 /n. Using this inequality we get from the equations above: 


I < 


IT 


E 


\/l4 AeS 1 n 


( \,v):Av=\v 


9 n 


- E y + V(AA, 

n ' V 8 n 
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We next use Theorem 2.14 (averaging over A G S) and bound this last quantity by 


I < 


7r f 9n 4 


- 


Vu V 

which finishes the proof. 


+ — n ) = 
n 


77 


( 9 n 


- 


v^4 V 


+ 4n 2 ) < 4n 2 , 

n> 2 


(54) 

□ 


We have used the following technical lemma which is in the spirit of [2]. Note 
that as pointed out in the proof of Theorem 2.28, £4 is just the unit sphere in the 
tangent space XA§ and thus has a natural measure inherited from the inner product 

in C nxn . 


Lemma 9.2 Let (A,X,v) € V. Define Sa ■= {A' : (A, A') € 5 } C T 4 S as in the 
proof of Theorem 2.28 (that is, Sa is the unit sphere ofTA §) and, for A € Sa, let 
A, v be as in Lemma 3.2. Then, 

E (|A| 2 ) = -4- T b.A(AA,„) 2 -hl!), 

Ae5 A n 2 - 2 V 2 / 


and 



AgSa 

In particular, from Proposition 3.3, 




2 

F- 


E (|A|" + ||h|| 2 )< —-- (l + 2||A A j,||f^ — —- y (l + 2 Hf(A, A, vY) . 

Ags a n ~ 2 n 2 

Proof. Note that T 4 S coincides with the (real) orthogonal complement, with 
respect to 3of A € C nxn . Thus dini]R(Ij 4 §) = 2n 2 — 1. On this space we 
consider the push-forward measure of the standard Gaussian distribution on C nxn 
by the orthogonal projection C nxn —> T 4 S. 

Since may be split in the (real) orthogonal decomposition M-yAMT © A 
where M\/— 1A is the linear real subspace generated by y/—lA, in particular we 
conclude that the Gaussian distribution on TA§ coincides with the distribution 
t.y/—lA + B € Tu§ where t ~ JM(0, ^) and B A r A r are independent. 


Claim I: Given a linear operator L : TA§ —>• C fc , we have 


E (||L(i)|| 2 ) = (n 2 -i) E (||LU)f); 

AeT A S \ £ / As<Sa 

The claim follows integrating in polar coordinates. More precisely, 


E (llUhll 2 ) = -4-r f ||i(i)|| 2 e-M |2 <a 

AsTaS 7T 2 JagTaS 

= ~At Y°° P^e-^dp- j ||i(i)|| 2 rfi 

7T 2 Jo J AGS A 

= (» 2 -f) E (||L(.4)|| 2 ), 

V 1 / As5a 
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where we have used that / 0 + °°p 2n2 e p “ dp = |r(n 2 + ^), and voI(<Sa) = 
27r" 2 -Vr(n 2 - i). 

Claim II: The push-forward measure of the Gaussian distribution on A 1 - by the map 
f :A ± ->v ± ,Ai->P v ±(Av), is the standard Gaussian on v ± . 

Note that for all B € C nxn , we have (uv*, B)p = tr (B*uv*) = v*B*u = (u, Bv). 
Then, the set F = {wv* : w £ r 1 } is a linear subspace of and the kernel 
of / is the Hermitian complement of F as subset of H 2 -. Since / \f- F —>• u 2- is a 
linear isometry, the claim follows from the characterization of the standard Gaussian 
distribution. 

Claim III: Let m € N. If 77 ~ Nc™ , and x € C m then 

E (\{r],x)\ 2 ) = \\x\\ 2 . 

The proof of this claim is a standard exercise and is left to the reader. 

Now we are ready to prove the lemma. We chose a representative of v such that 
||u|| = 1 and a representative of the left eigenvector u such that (u,v) = 1. Note 
that this implies by Proposition 3.3 and Lemma 3.3: 

Ma(A A,u) = MUM = ||.u||, A = (Av,u), v = -A~ 1 P v aAv. 

\{u,v) 

For the first statement, we have 

E (|A| 2 )= E (|(iu,u)| 2 )= E (\{A,uv*)f\ 2 ) 

AeT A s Agt a s A&t a s 

= E E {\{tV^lA +B,uv*)f\ 2 )- 
Be M a ± teAT( o,i) 

Since the mixed term of the expansion of \{t\f—lA + B 1 uv*)f| 2 is linear in t, its 
expected value is zero. Hence, 

E (|(iu,u)| 2 )= E E (t 2 \{A,uv*) F \ 2 + \{B,uv*)f\ 2 ) 

Agt a § BeAT A ± teAT( o,±) 

= ^+ E (\(B,f a a(uv*))f\ 2 . 

where we have denoted by n A ±(uv*) the orthogonal projection of uv* onto A 
With the identification of A and C n ~ 1 as Hermitian spaces, from Claim III (with 
m = n 2 — 1) we conclude that 

E ( \{B,it a ±(uv*))f \ 2 = h A ±{uv*)\\ 2 F = ||«v*Hf - |A | 2 = |M | 2 - |A| 2 . 
BeAGx 
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We have then proved: 


E (|(iu,u)| 2 ) = nx(A,\,v ) 2 - 

Agt a § 1 

and from claim I we conclude: 

E (\(Av,u}\ 2 )= 1 (nx(A,\,v ) 2 - , 

de«s A i n ~2) V 2 ) 

as claimed in the lemma. The second statement in the lemma is proved in a very 
similar fashion. This time we have 


E (|u| 2 ) 
Agt a § 


E (\\A~lP v ,Av\\ 2 ) = 

AeT A s 


E E 

BeA/> teARO.i) 




—IA + B I v 


E 

.Be M 


A-L 


A \l P v±Bv 


= . E (Py>l| 2 ) 

wgJV, 1 


-1 


Pa,, 


112 

II 


the previous to last equality from claim II and the last coming from the fact that for 
any matrix B € C nxn , K x ^j\f C n px|| 2 = ||-B||j 7 (note the use of Frobenius instead of 
operator norm in the last equality: that is a crucial point). The last statement of 
the lemma then follows from claim I. 

□ 


10 Proof of Theorem 2.32 

10.1 Proof of (1) and (2) in Theorem 2.32 

Note that (2) is trivial. We thus prove (1). The procedure we suggest to choose 
uj € Q n at random is the following (note that each step requires 0(?r 3 ) arithmetic 
operations or random choices): 

1. Choose B ~ A/’ C ( ri -i)x(n-i) and let U be the Q factor in the QR decomposition 
of B, then multiply Q by the diagonal matrix with entries rn/\ra\ where the 
ra are the diagonal elements of the R factor. This produces a unitary matrix 
U uniformly distributed in U n -i, see for example [35]. 

2. Choose A ~ A/c and M ~ A f C (n-i)xn. Let H € U n be any unitary matrix 
such that its last column is in ker(M) (it is trivial to produce such an H by 
computing the QR decomposition of the matrix whose columns are ker(M) and 
the columns of M ). Compute Q as the product of the first nx (n—1) submatrix 
of H times U. This produces an element with the uniform distribution in the 
set of Q G 5„_i(C n ) such that (M, Q) £ A n . 
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3. If 25ft(Atr(MQ)) > 1 — |A| 2 (n — 1) then discard A, M, Q and repeat (1) and (2). 

4. Choose w 

The only subtle point is that steps (1) and (2) might have to be repeated an arbitrary 
number of times. The expected number of times that steps (1) and (2) will be 
repeated is related to C n defined in (21) by 


Prob(step k is reached) = ^ Prob (2-ft(Atr(M<2)) > 1 — |A| 2 (n — 1)) 

k= 1 k=l 

oo 

k-1 1 


k-1 




k=l 


(1 - (1 -Cn 1 )) 


= Cn 


10.2 Proof of (3) in Theorem 2.32 


We are now prepared for proving (22). Let 11 be the characteristic function of the 
set 

{(A, B ) : 2»(Atr(B)) < 1 - | A| 2 (n - 1)} C C x 

From the dehnition and Fubini’s theorem, for any measurable nonnegative function 
0 defined on V, the expected value E w ^n n (^>(ifi n (uj))) equals: 


C n E E E 

M Q:(M,Q)eA„ \\,w 


A w* 

0 MQ + AI n _i 


,A,ei 1(A, MQ) ) = 


C n E E ( a(MQ )), 

M Q:{M,Q)&Au 

where A ~ A/"c, M ~ A( C («-i)xn, w € A/c«-i and a: cb 1-1 )*!™- 1 ) —> [0,oo] is defined 
by 

“ (B) = iK(o b + aIJ-^) »(*■*> 

We are then under the hypotheses of Corollary 6.7. Using this result we obtain 

C n 


E OKV’nM)) = 


v-rw E (a(B)\ det(L>)| 2 ) 

r(?r) l)x(n-l) 


With the change of variables B + A/ n _i = D, which implies 

\\B\\ 2 f = \\D\\ 2 f + (n - 1)|A| 2 - 2K(Atr(Z>)), 
this last expression equals 


C n 

uA E 

I (n) A,W,D 


A w* 

0 D 


A, ei ) | det(L> - AT^-i)| 2 e -l A l 2 ^- 1 > +23fi (^ tr ( jD )) 11(A, D - A/ n _i) 
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where D ~ M C ( n - i)x(n-i). Now, note that 


1(A, D - XI n -i) + 0 <*=*> e -|A| 2 (n-l)+25R(Atr(i?)) < 
We have thus proved 


e Crt 


E O(V’nH)) < ^ i y' i i n n 

W~fln I (n) A,«>,D V V V U U 


E 


A w* 


, A,ei) | det(D - A/ n _i) 


1 




= eriC n E 

( 36 ) A~ATf.n-x.ri \ n X v . Av=Xv 

This proves claim (3) in Theorem 2.32. □ 

We prove the following (non-sharp) bound for the value of C n . 

Lemma 10.1 With the notations above, 

C n < 4 n. 

Proof. Note that if 0 < |A| < (n — l) -1 / 2 , then for any nonzero M € c( n_1 ) xn 
we have 

Prob (2K(Atr (MQ)) < 1 - |A| 2 (n - 1)) > Prob (2K(Atr(MQ)) < 0) = 

Q Q Z 

the last equality coming from the linearity of the trace. We thus have 


1 


Prob (2»(Atr(MQ)) < 1 - |A| 2 (n - 1)) > 9 

X ’ M ,Q 7T J |A|<(n-l)-V 2 2 


p I'M 1 — p n — 1 

dX = - 


We thus have 


as claimed. 


C n < 


y n — l 

1 — e - "- 1 


< 4(n — 1) < 4n, 


□ 


11 Proof of Theorem 1.1 

Consider the following algorithm. 
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Algorithm 7 Relative_Error 

Input: s G (0,1/2), A G C nxn , ((,w) € C x C n 

Preconditions: (A, £, w) G VV so (£, w) is an approximate eigenpair 

of A with associated eigenpair (A,u), \\w\\ = 1, ||A||i? = 1 

k:=0 

(C'X) := (C,«0 

repeat 

(C,w') := Na(C' i w ') (one Newton iteration) 
k := k + 1 

until k > log 2 log 2 (:) 

return 

Output: (£',«/) G C x C n 

Postconditions: The algorithm halts if A / 0. In this case, 
(A, £',u/) G W so (£', u/) is an approximate eigenpair of A with asso¬ 
ciated eigenpair and d§(w',v ) < e, and moreover |£' — A| < e|A|. 


By hypothesis, 


distA((C,w),(A,v)) <-^7T7 

Mmaxv^-J 


< 1. 


(55) 


Hence, |£ — A| < disUi((C, w), (A,u)) < 1 and the same bound holds with £ replaced 
by £ / at all the iterations of the algorithm (by Definition 2.11). Using that |A| < 
||A||i? = 1 , we deduce that |£'| < 2 . Hence, at the end of the repeat loop the value 
k satisfies 


> 


4 2 

e|C'| “ e 


(56) 


This inequality implies that after k iterations of the loop we have, from the definition 
of approximate eigenpair and the bound (55), that 


disU((C',m'),(A,u))<-^- 

Mmaxv-'v 



(57) 


In particular, d§(w',v) < £ as we wanted. On the other hand, the first inequality 
in (56) implies 

2 2k ~ 1 \C'\ - 1 > - - 1 > - 

£ £ 

the last since e < 1. We now use this inequality together with (57) to obtain 


|C' —A| ^ 1 1 _ 1 

T £ £ ( ic'i - pfe ) 22-1 _ 2 2 *-'lci -1 s 6 
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i.e., |C' - A| < e|A|. 

It remains to show that Relative_Error halts provided A / 0 and to estimate its 
average running time when A is drawn from Af^nxn. For this, we note that as soon 
as 


k > log 2 log 2 


|A|e 


we shall have (using (57)) 


IC'I > |A| - (i) 


2—1 


>|A|-hl = |A|(l-e/4). 


Therefore, we will also have 
4 


log 2 log 2 


elC'l 


< log 2 log 2 


e|A|(l - e/4) 


< log 2 log 2 —- < k. 


|A|e 


Hence, the stopping condition will hold after at most 

'sp- 1 ! 


log 2 log 2 


|A|e 


< log 2 log 2 


iterations (we have used that | Ap 1 < P _1 || for A is an eigenvalue of A). 

We finally estimate the average cost of Relative_Error. Since each iteration of the 
repeat loop requires 0(n 3 ) operations, this cost is at most 0(n 3 ) times 


Ar^Mg 


c nxr 


E log 2 log 2 


8P 


-H 


<log 2 log 2 - E (P ||) , 


e K 


where we have used Jensen’s inequality. Bounds for the expected value of |p 3 || 
when A Afcnxn are known, see for example [16, Prop. 4.22] which, together with 
the general inequality E{Z) 2 < E (Z 2 ), implies 


, E (IP _1 ||) < 

Ar^Jf C n x n 


e{n + 1) 


< ven. 


The statement follows. □ 
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